diff --git a/downstream_tasks/JAX_GPU_+_LongT5_XL_Q_4_bit_+_LoRA.ipynb b/downstream_tasks/JAX_GPU_+_LongT5_XL_Q_4_bit_+_LoRA.ipynb
new file mode 100644
index 0000000..089a51a
--- /dev/null
+++ b/downstream_tasks/JAX_GPU_+_LongT5_XL_Q_4_bit_+_LoRA.ipynb
@@ -0,0 +1,4426 @@
+{
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "PiSi90gspEQP"
+      },
+      "source": [
+        "# Easy GPT-Q + LoRA in JAX ([github](https://github.com/davisyoshida/easy-lora-and-gptq))\n",
+        "\n",
+        "[Davis Yoshida](https://github.com/davisyoshida/)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "hfxALa1so2JD"
+      },
+      "source": [
+        "This notebook shows how to combine two JAX tools/transforms I wrote: [Lorax](https://github.com/davisyoshida/lorax) and [JAX-GPTQ](https://github.com/davisyoshida/jax-gptq). I've been using the combination to run LLaMA finetunes on a single GPU.\n",
+        "\n",
+        "They're both applicable to basically any JAX function, which conveniently includes many HuggingFace models!\n",
+        "\n",
+        "The procedure is as follows:\n",
+        "\n",
+        "1. Quantize the weights of the model we want to use\n",
+        "2. Use Lorax to transform the original model function `F(params, inputs)` to one that takes a tuple of the original params and the low rank LoRA params: `F_lora(param_tuple, inputs)`\n",
+        "3. Wrap `F_lora` in the `use_quantized` transform so that it knows how to handle arguments which are int8 matrices with two parameters per byte.\n",
+        "4. Train the model, updating only the low rank params and leaving the larger 4-bit model weights frozen.\n",
+        "\n",
+        "I'd love feedback on one or both of these tools, so please let me know on their GitHub repositories if you have any suggestions. JAX-GPTQ in particular is still in a really early state."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "#### XLA Runtime OOM Prevention"
+      ],
+      "metadata": {
+        "id": "SYw-sN1-eX3n"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "import os\n",
+        "\n",
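+        "# Set these before JAX initializes its GPU backend.\n",
+        "# Fraction of GPU memory XLA preallocates (per the JAX docs, this applies when preallocation is enabled)\n",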
+        "os.environ[\"XLA_PYTHON_CLIENT_MEM_FRACTION\"]=\".9\"\n",
+        "\n",
+        "# Disable preallocation of memory\n",
+        "os.environ[\"XLA_PYTHON_CLIENT_PREALLOCATE\"]=\"false\"\n",
+        "\n",
+        "# Use the platform allocator (allocates on demand and frees unused memory, but is slower) instead of JAX's default allocator\n",
+        "os.environ[\"XLA_PYTHON_CLIENT_ALLOCATOR\"]=\"platform\""
+      ],
+      "metadata": {
+        "id": "3DPHwXufeYGC"
+      },
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "0Y6JeyF45yd_"
+      },
+      "source": [
+        "### Setup"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "collapsed": true,
+        "id": "ljjNpQvkrhsA",
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "outputId": "c132d295-da99-47cb-f7c9-7fb130ec6d9b"
+      },
+      "outputs": [
+        {
+          "output_type": "stream",
+          "name": "stdout",
+          "text": [
+            "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
+            "Collecting git+https://github.com/davisyoshida/jax-gptq.git\n",
+            "  Cloning https://github.com/davisyoshida/jax-gptq.git to /tmp/pip-req-build-k_jo2l0c\n",
+            "  Running command git clone --filter=blob:none --quiet https://github.com/davisyoshida/jax-gptq.git /tmp/pip-req-build-k_jo2l0c\n",
+            "  Resolved https://github.com/davisyoshida/jax-gptq.git to commit 8b8ff0fd23b4a7732f1c5dca98d7275045194d3c\n",
+            "  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
+            "Building wheels for collected packages: jax-gptq\n",
+            "  Building wheel for jax-gptq (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
+            "  Created wheel for jax-gptq: filename=jax_gptq-0.0.1-py3-none-any.whl size=16385 sha256=a2859bad302537b7f25b2bee3f4c1b5bbbb271b30821e6db4b595b038197e9e4\n",
+            "  Stored in directory: /tmp/pip-ephem-wheel-cache-ck8sz67p/wheels/ff/5e/fb/dec939c953c916b7437c0ce0839617a79dc06e0a2fd85138a2\n",
+            "Successfully built jax-gptq\n",
+            "Installing collected packages: jax-gptq\n",
+            "Successfully installed jax-gptq-0.0.1\n",
+            "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
+            "Collecting jax-lorax\n",
+            "  Downloading jax_lorax-0.1.2-py3-none-any.whl (8.4 kB)\n",
+            "Requirement already satisfied: jax<0.5.0,>=0.4.6 in /usr/local/lib/python3.10/dist-packages (from jax-lorax) (0.4.10)\n",
+            "Requirement already satisfied: jaxlib<0.5.0,>=0.4.6 in /usr/local/lib/python3.10/dist-packages (from jax-lorax) (0.4.10+cuda11.cudnn86)\n",
+            "Requirement already satisfied: ml-dtypes>=0.1.0 in /usr/local/lib/python3.10/dist-packages (from jax<0.5.0,>=0.4.6->jax-lorax) (0.1.0)\n",
+            "Requirement already satisfied: numpy>=1.21 in /usr/local/lib/python3.10/dist-packages (from jax<0.5.0,>=0.4.6->jax-lorax) (1.22.4)\n",
+            "Requirement already satisfied: opt-einsum in /usr/local/lib/python3.10/dist-packages (from jax<0.5.0,>=0.4.6->jax-lorax) (3.3.0)\n",
+            "Requirement already satisfied: scipy>=1.7 in /usr/local/lib/python3.10/dist-packages (from jax<0.5.0,>=0.4.6->jax-lorax) (1.10.1)\n",
+            "Installing collected packages: jax-lorax\n",
+            "Successfully installed jax-lorax-0.1.2\n",
+            "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
+            "Collecting transformers\n",
+            "  Downloading transformers-4.29.2-py3-none-any.whl (7.1 MB)\n",
+            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m7.1/7.1 MB\u001b[0m \u001b[31m37.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+            "\u001b[?25hRequirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from transformers) (3.12.0)\n",
+            "Collecting huggingface-hub<1.0,>=0.14.1 (from transformers)\n",
+            "  Downloading huggingface_hub-0.15.1-py3-none-any.whl (236 kB)\n",
+            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m236.8/236.8 kB\u001b[0m \u001b[31m28.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+            "\u001b[?25hRequirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from transformers) (1.22.4)\n",
+            "Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from transformers) (23.1)\n",
+            "Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (6.0)\n",
+            "Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers) (2022.10.31)\n",
+            "Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers) (2.27.1)\n",
+            "Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers)\n",
+            "  Downloading tokenizers-0.13.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)\n",
+            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m7.8/7.8 MB\u001b[0m \u001b[31m93.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+            "\u001b[?25hRequirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/dist-packages (from transformers) (4.65.0)\n",
+            "Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0,>=0.14.1->transformers) (2023.4.0)\n",
+            "Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0,>=0.14.1->transformers) (4.5.0)\n",
+            "Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (1.26.15)\n",
+            "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (2022.12.7)\n",
+            "Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (2.0.12)\n",
+            "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (3.4)\n",
+            "Installing collected packages: tokenizers, huggingface-hub, transformers\n",
+            "Successfully installed huggingface-hub-0.15.1 tokenizers-0.13.3 transformers-4.29.2\n",
+            "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
+            "Collecting accelerate\n",
+            "  Downloading accelerate-0.19.0-py3-none-any.whl (219 kB)\n",
+            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m219.1/219.1 kB\u001b[0m \u001b[31m6.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
+            "\u001b[?25hRequirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from accelerate) (1.22.4)\n",
+            "Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from accelerate) (23.1)\n",
+            "Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from accelerate) (5.9.5)\n",
+            "Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from accelerate) (6.0)\n",
+            "Requirement already satisfied: torch>=1.6.0 in /usr/local/lib/python3.10/dist-packages (from accelerate) (2.0.1+cu118)\n",
+            "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch>=1.6.0->accelerate) (3.12.0)\n",
+            "Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from torch>=1.6.0->accelerate) (4.5.0)\n",
+            "Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch>=1.6.0->accelerate) (1.11.1)\n",
+            "Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch>=1.6.0->accelerate) (3.1)\n",
+            "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=1.6.0->accelerate) (3.1.2)\n",
+            "Requirement already satisfied: triton==2.0.0 in /usr/local/lib/python3.10/dist-packages (from torch>=1.6.0->accelerate) (2.0.0)\n",
+            "Requirement already satisfied: cmake in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch>=1.6.0->accelerate) (3.25.2)\n",
+            "Requirement already satisfied: lit in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch>=1.6.0->accelerate) (16.0.5)\n",
+            "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch>=1.6.0->accelerate) (2.1.2)\n",
+            "Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch>=1.6.0->accelerate) (1.3.0)\n",
+            "Installing collected packages: accelerate\n",
+            "Successfully installed accelerate-0.19.0\n"
+          ]
+        }
+      ],
+      "source": [
+        "!pip install git+https://github.com/davisyoshida/jax-gptq.git\n",
+        "!pip install jax-lorax\n",
+        "!pip install transformers\n",
+        "!pip install accelerate"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "75-T_R0Ms9qD"
+      },
+      "outputs": [],
+      "source": [
+        "from functools import partial\n",
+        "import jax\n",
+        "import jax.numpy as jnp\n",
+        "import numpy as np\n",
+        "import optax\n",
+        "import torch\n",
+        "\n",
+        "import transformers\n",
+        "from transformers import (\n",
+        "    CONFIG_MAPPING,\n",
+        "    FLAX_MODEL_FOR_CAUSAL_LM_MAPPING,\n",
+        "    AutoConfig,\n",
+        "    AutoTokenizer,\n",
+        "    FlaxAutoModelForCausalLM,\n",
+        "    HfArgumentParser,\n",
+        "    TrainingArguments,\n",
+        "    is_tensorboard_available,\n",
+        ")\n",
+        "\n",
+        "from tqdm import trange\n",
+        "\n",
+        "import lorax\n",
+        "import jax_gptq\n",
+        "\n",
+        "gpu = jax.devices('gpu')[0]\n",
+        "cpu = jax.devices('cpu')[0]"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "GQuDSjz7svdL"
+      },
+      "source": [
+        "## Toy Example\n",
+        "\n",
+        "### Model/Data setup\n",
+        "\n",
+        "First we'll define an MLP and make some parameters for it:"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "from transformers import LongT5Config, FlaxT5ForConditionalGeneration\n",
+        "from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer\n",
+        "\n",
+        "from transformers import BitsAndBytesConfig\n",
+        "\n",
+        "\n",
+        "# 4-bit NF4 quantization with double quantization; matmuls are computed in bfloat16\n",
+        "nf4_config = BitsAndBytesConfig(\n",
+        "    load_in_4bit=True,\n",
+        "    bnb_4bit_quant_type=\"nf4\",\n",
+        "    bnb_4bit_use_double_quant=True,\n",
+        "    bnb_4bit_compute_dtype=torch.bfloat16,\n",
+        ")\n",
+        "\n",
+        "# Load the LongT5-XL model with its configuration\n",
+        "model_id = \"google/long-t5-tglobal-xl\"\n",
+        "config = LongT5Config.from_pretrained(model_id)\n",
+        "#model = AutoModelForSeq2SeqLM.from_pretrained(model_id, load_in_4bit=True, device_map=\"auto\")\n",
+        "model = AutoModelForSeq2SeqLM.from_pretrained(model_id, quantization_config=nf4_config)\n",
+        "tokenizer = AutoTokenizer.from_pretrained(model_id)"
+      ],
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 305,
+          "referenced_widgets": [
+            "1feaed28d930404d8684eb14f7f363c7",
+            "3b045a0e3ecf4c098d8fa8b0917108a5",
+            "ab35fcc175334e089b9abf95d91604c3",
+            "bc4dcd32917f4970843bdde97d849285",
+            "ba84ab17fb18462a82cd6ddd9581d8de",
+            "479284c4dab249d5b88b9f471e988392",
+            "98ba61d9a09b42629f5985a4b82041c5",
+            "0438654193164051b8eaf4e234299bf2",
+            "dd336b937eff4790a8b7ed0b5f8d211f",
+            "d86b4cb7e789448a93d78e597a9c45eb",
+            "3fbe401e5b084e66b567e6a597183f01",
+            "bd19569f52df4784ac7dbdfd159f5588",
+            "1314b9257b2047769a7c95408ddaddc3",
+            "196c4ac94eab47048c7de169ce8e12f1",
+            "9c40c0629bac4d3884a991153bb61a15",
+            "44411fd1e3254d97b05e3b536c5b4f3b",
+            "272afa7b903244e297c4904b19c01632",
+            "f6ef494723ad4b078cda848ec84bf621",
+            "646e4a2d4ed34cd2ab73cb598a011075",
+            "b4a84db21aa44c77bdbddbaabc3fc079",
+            "3313b927bd0a4546baf4ce6f6ebd5bcf",
+            "9b2914d7649a447a9e059d7a65600e44",
+            "1169578ce65440f695f8ec234c84a90d",
+            "a835304240c54c5db280426c5fb856c4",
+            "cd42bbbaf3eb4c4fb1ade3c314b2eaac",
+            "ebdcacb5d2094c878af40538fb2ec42c",
+            "3ead271f91ea4af3a64928754048020f",
+            "93e3146c97c4430a84524f9fc804a219",
+            "0708839b96b1495a9e2f15f0d1b78574",
+            "fd56b1d95eb04339b9b01ba691db4e81",
+            "701570dcfc2a4b8d99b2ea9c08057fe7",
+            "521d7df8efe849c0a43b7b2f2b10ba24",
+            "0da6f8caf9b34844bf763f7030c5e504",
+            "f99effa921ce42e2b62ca92d18d6d8d0",
+            "33889cada1fd42a3933eb4f1e923d7a4",
+            "0f30b08741a74eab83dd3689671f1269",
+            "fd73f715ee7548979c57df22efda5097",
+            "e44cf747982d41bb862843133b58f4b1",
+            "8c12d66a9488411986f8a0d16822ba87",
+            "df7a49e2d8b34643baf60560e1aa89fc",
+            "7974a3139aca4dabbc2bc57844121070",
+            "865b5e4cd3a2499da68eaaf5d47dfc95",
+            "6a96f4ad805142468ae84bf77ed41d3a",
+            "6a751bdf2f094739b4c3cba478625f84",
+            "c7f3959181c046c7b281c5af7d8c959c",
+            "a90ac03129fd40a2b5c9ae8d59dcdaef",
+            "20b6fae45ebd4badbf402c81e3d24d47",
+            "10973afff9364526b481f09d89e1d7e2",
+            "9d734a6342524c6dab9699468609a1d4",
+            "630542ff534b410dadf99e20bb6194c5",
+            "a2d9be02f65a47cfbec746acd4a3360c",
+            "c6948d90a1484452bc693646ad619666",
+            "01641210f615480f8c239feb99f0e323",
+            "0012771409a04059bb6eb85a6ad5eff2",
+            "430f70d9467041f8bea72b4a16e430de",
+            "88a9e1e6abdc439e9808df4235585137",
+            "6874e404870a407ab1672e15612712d8",
+            "80c4510fff3e44c58b52e314d4ce1174",
+            "45771a1396094a2e9f8bb6dc6800b908",
+            "b3b373bdfc26499dbf400667b2edc6b4",
+            "8c93c66cdf0a44d4ab9bcc4df497d1ec",
+            "cd379c6435394897a3906d4bc6b71345",
+            "5d198996c0344b8f92409b1cbff19af4",
+            "d965c2e080bd452a99954a3b3af2c147",
+            "b79886293822421895753aa0e2d2d170",
+            "f27d00a5f5964ee684787d97f132834a",
+            "2f214f129655487baff53148920c7c95",
+            "ef32e3942ca14835aaa1fcd1bfc80018",
+            "43d505d0a0294e4abd4878f53d7f7bc8",
+            "297ed97f4dbb43aba188a337e1d8bf26",
+            "0e2a8d04b5a143bc9bfa77f7b538954e",
+            "f99a48ae06114986aa4249d0876962a8",
+            "025cd7df6ab64885bb7939f28a28be28",
+            "71401278061d4f55af185007f7e21289",
+            "36e7f52522764ab9bd5669366c49a8ea",
+            "2d9b52f788d34d4d94b6e1b17f0a6ace",
+            "5c662a378c2a4529a0c742fb0a0210fe",
+            "db160865302c401facf54afd53f972d4",
+            "60d5faa861404744bd685257af4be791",
+            "de92bc34f9d04ad18e0bbfb8cfdc4e3b",
+            "077954e270e444caa52c3f6bbcf20817",
+            "8ecb7100390e4d0cb3425b173dcf07f1",
+            "12a0cb44ed93433ebbc9f6c80c4e494c",
+            "a7b74c6b00384f3f905e3fb8a8c3f216",
+            "c92cfb1a34af4d35ab4a1cbb552f1a8b",
+            "dea195394d484e6f96e097ac45eb926a",
+            "7a381e3cefcf48a1bc188c49985f2bf2",
+            "40e7ef29f37e414cb27fc6798ec3dd69",
+            "c1aadb4cc2754a56af360a1534dc9c07",
+            "b9874db4b125444c9c1e461ec982ab1d",
+            "e78eef1b3587420cb805c2899d3d50f5",
+            "a51b630132cc4f8dacd2b59666ca0640",
+            "e5c5b8a19d5b46269c203e313ac56ca2",
+            "3b92d3a3cb0e4c19bf9fead4422a5554",
+            "34ad0d26150f4b589149d4ece6bf877a",
+            "d6a31ea687064ceebf54819168df219c",
+            "f88fc2141c524793b3266f2ead90b42b",
+            "d868549bbf9241e8b20a922ad5426b3e",
+            "4fa6dc96da8f49929249cc061a052167"
+          ]
+        },
+        "id": "YKcA0xmzRIas",
+        "outputId": "61ec2b74-04ab-4eb4-962d-8862a0441d1a"
+      },
+      "execution_count": null,
+      "outputs": [
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "Downloading (…)lve/main/config.json:   0%|          | 0.00/896 [00:00<?, ?B/s]"
+            ],
+            "application/vnd.jupyter.widget-view+json": {
+              "version_major": 2,
+              "version_minor": 0,
+              "model_id": "1feaed28d930404d8684eb14f7f363c7"
+            }
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "Downloading (…)model.bin.index.json:   0%|          | 0.00/55.4k [00:00<?, ?B/s]"
+            ],
+            "application/vnd.jupyter.widget-view+json": {
+              "version_major": 2,
+              "version_minor": 0,
+              "model_id": "bd19569f52df4784ac7dbdfd159f5588"
+            }
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]"
+            ],
+            "application/vnd.jupyter.widget-view+json": {
+              "version_major": 2,
+              "version_minor": 0,
+              "model_id": "1169578ce65440f695f8ec234c84a90d"
+            }
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "Downloading (…)l-00001-of-00002.bin:   0%|          | 0.00/9.45G [00:00<?, ?B/s]"
+            ],
+            "application/vnd.jupyter.widget-view+json": {
+              "version_major": 2,
+              "version_minor": 0,
+              "model_id": "f99effa921ce42e2b62ca92d18d6d8d0"
+            }
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "Downloading (…)l-00002-of-00002.bin:   0%|          | 0.00/1.95G [00:00<?, ?B/s]"
+            ],
+            "application/vnd.jupyter.widget-view+json": {
+              "version_major": 2,
+              "version_minor": 0,
+              "model_id": "c7f3959181c046c7b281c5af7d8c959c"
+            }
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]"
+            ],
+            "application/vnd.jupyter.widget-view+json": {
+              "version_major": 2,
+              "version_minor": 0,
+              "model_id": "88a9e1e6abdc439e9808df4235585137"
+            }
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "Downloading (…)neration_config.json:   0%|          | 0.00/147 [00:00<?, ?B/s]"
+            ],
+            "application/vnd.jupyter.widget-view+json": {
+              "version_major": 2,
+              "version_minor": 0,
+              "model_id": "2f214f129655487baff53148920c7c95"
+            }
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "Downloading spiece.model:   0%|          | 0.00/792k [00:00<?, ?B/s]"
+            ],
+            "application/vnd.jupyter.widget-view+json": {
+              "version_major": 2,
+              "version_minor": 0,
+              "model_id": "db160865302c401facf54afd53f972d4"
+            }
+          },
+          "metadata": {}
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "Downloading (…)/main/tokenizer.json:   0%|          | 0.00/1.39M [00:00<?, ?B/s]"
+            ],
+            "application/vnd.jupyter.widget-view+json": {
+              "version_major": 2,
+              "version_minor": 0,
+              "model_id": "c1aadb4cc2754a56af360a1534dc9c07"
+            }
+          },
+          "metadata": {}
+        }
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "Djyo_reAs26R",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 1000
+        },
+        "outputId": "35db4379-5dba-4170-94a8-0fc7daac0d29"
+      },
+      "outputs": [
+        {
+          "output_type": "display_data",
+          "data": {
+            "text/plain": [
+              "\u001b[31m╭─\u001b[0m\u001b[31m──────────────────────────────\u001b[0m\u001b[31m \u001b[0m\u001b[1;31mTraceback \u001b[0m\u001b[1;2;31m(most recent call last)\u001b[0m\u001b[31m \u001b[0m\u001b[31m───────────────────────────────\u001b[0m\u001b[31m─╮\u001b[0m\n",
+              "\u001b[31m│\u001b[0m in \u001b[92m<cell line: 46>\u001b[0m:\u001b[94m46\u001b[0m                                                                            \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m                                                                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m \u001b[2;33m/usr/local/lib/python3.10/dist-packages/optax/_src/\u001b[0m\u001b[1;33mcombine.py\u001b[0m:\u001b[94m50\u001b[0m in \u001b[92minit_fn\u001b[0m                      \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m                                                                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 47 \u001b[0m\u001b[2m  \u001b[0minit_fns, update_fns = \u001b[96mzip\u001b[0m(*transforms)                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 48 \u001b[0m\u001b[2m  \u001b[0m                                                                                         \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 49 \u001b[0m\u001b[2m  \u001b[0m\u001b[94mdef\u001b[0m \u001b[92minit_fn\u001b[0m(params):                                                                     \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m \u001b[31m❱ \u001b[0m 50 \u001b[2m│   \u001b[0m\u001b[94mreturn\u001b[0m \u001b[96mtuple\u001b[0m(fn(params) \u001b[94mfor\u001b[0m fn \u001b[95min\u001b[0m init_fns)                                            \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 51 \u001b[0m\u001b[2m  \u001b[0m                                                                                         \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 52 \u001b[0m\u001b[2m  \u001b[0m\u001b[94mdef\u001b[0m \u001b[92mupdate_fn\u001b[0m(updates, state, params=\u001b[94mNone\u001b[0m, **extra_args):                                \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 53 \u001b[0m\u001b[2m│   \u001b[0m\u001b[94mif\u001b[0m \u001b[96mlen\u001b[0m(update_fns) != \u001b[96mlen\u001b[0m(state):                                                      \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m                                                                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m \u001b[2;33m/usr/local/lib/python3.10/dist-packages/optax/_src/\u001b[0m\u001b[1;33mcombine.py\u001b[0m:\u001b[94m50\u001b[0m in \u001b[92m<genexpr>\u001b[0m                    \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m                                                                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 47 \u001b[0m\u001b[2m  \u001b[0minit_fns, update_fns = \u001b[96mzip\u001b[0m(*transforms)                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 48 \u001b[0m\u001b[2m  \u001b[0m                                                                                         \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 49 \u001b[0m\u001b[2m  \u001b[0m\u001b[94mdef\u001b[0m \u001b[92minit_fn\u001b[0m(params):                                                                     \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m \u001b[31m❱ \u001b[0m 50 \u001b[2m│   \u001b[0m\u001b[94mreturn\u001b[0m \u001b[96mtuple\u001b[0m(fn(params) \u001b[94mfor\u001b[0m fn \u001b[95min\u001b[0m init_fns)                                            \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 51 \u001b[0m\u001b[2m  \u001b[0m                                                                                         \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 52 \u001b[0m\u001b[2m  \u001b[0m\u001b[94mdef\u001b[0m \u001b[92mupdate_fn\u001b[0m(updates, state, params=\u001b[94mNone\u001b[0m, **extra_args):                                \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 53 \u001b[0m\u001b[2m│   \u001b[0m\u001b[94mif\u001b[0m \u001b[96mlen\u001b[0m(update_fns) != \u001b[96mlen\u001b[0m(state):                                                      \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m                                                                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m \u001b[2;33m/usr/local/lib/python3.10/dist-packages/optax/_src/\u001b[0m\u001b[1;33mtransform.py\u001b[0m:\u001b[94m335\u001b[0m in \u001b[92minit_fn\u001b[0m                   \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m                                                                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 332 \u001b[0m\u001b[2m  \u001b[0mmu_dtype = utils.canonicalize_dtype(mu_dtype)                                           \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 333 \u001b[0m\u001b[2m  \u001b[0m                                                                                        \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 334 \u001b[0m\u001b[2m  \u001b[0m\u001b[94mdef\u001b[0m \u001b[92minit_fn\u001b[0m(params):                                                                    \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m \u001b[31m❱ \u001b[0m 335 \u001b[2m│   \u001b[0mmu = jax.tree_util.tree_map(  \u001b[2m# First moment\u001b[0m                                          \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 336 \u001b[0m\u001b[2m│   │   \u001b[0m\u001b[94mlambda\u001b[0m t: jnp.zeros_like(t, dtype=mu_dtype), params)                              \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 337 \u001b[0m\u001b[2m│   \u001b[0mnu = jax.tree_util.tree_map(jnp.zeros_like, params)  \u001b[2m# Second moment\u001b[0m                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 338 \u001b[0m\u001b[2m│   \u001b[0m\u001b[94mreturn\u001b[0m ScaleByAdamState(count=jnp.zeros([], jnp.int32), mu=mu, nu=nu)                 \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m                                                                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m \u001b[2;33m/usr/local/lib/python3.10/dist-packages/jax/_src/\u001b[0m\u001b[1;33mtree_util.py\u001b[0m:\u001b[94m210\u001b[0m in \u001b[92mtree_map\u001b[0m                    \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m                                                                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m207 \u001b[0m\u001b[2;33m  \u001b[0m\u001b[33m\"\"\"\u001b[0m                                                                                      \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m208 \u001b[0m\u001b[2m  \u001b[0mleaves, treedef = tree_flatten(tree, is_leaf)                                            \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m209 \u001b[0m\u001b[2m  \u001b[0mall_leaves = [leaves] + [treedef.flatten_up_to(r) \u001b[94mfor\u001b[0m r \u001b[95min\u001b[0m rest]                         \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m \u001b[31m❱ \u001b[0m210 \u001b[2m  \u001b[0m\u001b[94mreturn\u001b[0m treedef.unflatten(f(*xs) \u001b[94mfor\u001b[0m xs \u001b[95min\u001b[0m \u001b[96mzip\u001b[0m(*all_leaves))                              \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m211 \u001b[0m                                                                                           \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m212 \u001b[0m\u001b[94mdef\u001b[0m \u001b[92mbuild_tree\u001b[0m(treedef: PyTreeDef, xs: Any) -> Any:                                        \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m213 \u001b[0m\u001b[2m  \u001b[0m\u001b[94mreturn\u001b[0m treedef.from_iterable_tree(xs)                                                    \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m                                                                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m \u001b[2;33m/usr/local/lib/python3.10/dist-packages/jax/_src/\u001b[0m\u001b[1;33mtree_util.py\u001b[0m:\u001b[94m210\u001b[0m in \u001b[92m<genexpr>\u001b[0m                   \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m                                                                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m207 \u001b[0m\u001b[2;33m  \u001b[0m\u001b[33m\"\"\"\u001b[0m                                                                                      \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m208 \u001b[0m\u001b[2m  \u001b[0mleaves, treedef = tree_flatten(tree, is_leaf)                                            \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m209 \u001b[0m\u001b[2m  \u001b[0mall_leaves = [leaves] + [treedef.flatten_up_to(r) \u001b[94mfor\u001b[0m r \u001b[95min\u001b[0m rest]                         \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m \u001b[31m❱ \u001b[0m210 \u001b[2m  \u001b[0m\u001b[94mreturn\u001b[0m treedef.unflatten(f(*xs) \u001b[94mfor\u001b[0m xs \u001b[95min\u001b[0m \u001b[96mzip\u001b[0m(*all_leaves))                              \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m211 \u001b[0m                                                                                           \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m212 \u001b[0m\u001b[94mdef\u001b[0m \u001b[92mbuild_tree\u001b[0m(treedef: PyTreeDef, xs: Any) -> Any:                                        \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m213 \u001b[0m\u001b[2m  \u001b[0m\u001b[94mreturn\u001b[0m treedef.from_iterable_tree(xs)                                                    \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m                                                                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m \u001b[2;33m/usr/local/lib/python3.10/dist-packages/optax/_src/\u001b[0m\u001b[1;33mtransform.py\u001b[0m:\u001b[94m336\u001b[0m in \u001b[92m<lambda>\u001b[0m                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m                                                                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 333 \u001b[0m\u001b[2m  \u001b[0m                                                                                        \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 334 \u001b[0m\u001b[2m  \u001b[0m\u001b[94mdef\u001b[0m \u001b[92minit_fn\u001b[0m(params):                                                                    \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 335 \u001b[0m\u001b[2m│   \u001b[0mmu = jax.tree_util.tree_map(  \u001b[2m# First moment\u001b[0m                                          \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m \u001b[31m❱ \u001b[0m 336 \u001b[2m│   │   \u001b[0m\u001b[94mlambda\u001b[0m t: jnp.zeros_like(t, dtype=mu_dtype), params)                              \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 337 \u001b[0m\u001b[2m│   \u001b[0mnu = jax.tree_util.tree_map(jnp.zeros_like, params)  \u001b[2m# Second moment\u001b[0m                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 338 \u001b[0m\u001b[2m│   \u001b[0m\u001b[94mreturn\u001b[0m ScaleByAdamState(count=jnp.zeros([], jnp.int32), mu=mu, nu=nu)                 \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m 339 \u001b[0m                                                                                          \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m                                                                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m \u001b[2;33m/usr/local/lib/python3.10/dist-packages/jax/_src/numpy/\u001b[0m\u001b[1;33mlax_numpy.py\u001b[0m:\u001b[94m2054\u001b[0m in \u001b[92mzeros_like\u001b[0m           \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m                                                                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m2051 \u001b[0m\u001b[1;95m@util\u001b[0m._wraps(np.zeros_like)                                                               \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m2052 \u001b[0m\u001b[94mdef\u001b[0m \u001b[92mzeros_like\u001b[0m(a: ArrayLike, dtype: Optional[DTypeLike] = \u001b[94mNone\u001b[0m,                           \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m2053 \u001b[0m\u001b[2m│   │   │      \u001b[0mshape: Any = \u001b[94mNone\u001b[0m) -> Array:                                               \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m \u001b[31m❱ \u001b[0m2054 \u001b[2m  \u001b[0mutil.check_arraylike(\u001b[33m\"\u001b[0m\u001b[33mzeros_like\u001b[0m\u001b[33m\"\u001b[0m, a)                                                   \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m2055 \u001b[0m\u001b[2m  \u001b[0mdtypes.check_user_dtype_supported(dtype, \u001b[33m\"\u001b[0m\u001b[33mzeros_like\u001b[0m\u001b[33m\"\u001b[0m)                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m2056 \u001b[0m\u001b[2m  \u001b[0m\u001b[94mif\u001b[0m shape \u001b[95mis\u001b[0m \u001b[95mnot\u001b[0m \u001b[94mNone\u001b[0m:                                                                   \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m2057 \u001b[0m\u001b[2m│   \u001b[0mshape = canonicalize_shape(shape)                                                     \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m                                                                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m \u001b[2;33m/usr/local/lib/python3.10/dist-packages/jax/_src/numpy/\u001b[0m\u001b[1;33mutil.py\u001b[0m:\u001b[94m328\u001b[0m in \u001b[92mcheck_arraylike\u001b[0m            \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m                                                                                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m325 \u001b[0m\u001b[2m│   \u001b[0mpos, arg = \u001b[96mnext\u001b[0m((i, arg) \u001b[94mfor\u001b[0m i, arg \u001b[95min\u001b[0m \u001b[96menumerate\u001b[0m(args)                                 \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m326 \u001b[0m\u001b[2m│   │   │   │   │   \u001b[0m\u001b[94mif\u001b[0m \u001b[95mnot\u001b[0m _arraylike(arg))                                                \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m327 \u001b[0m\u001b[2m│   \u001b[0mmsg = \u001b[33m\"\u001b[0m\u001b[33m{}\u001b[0m\u001b[33m requires ndarray or scalar arguments, got \u001b[0m\u001b[33m{}\u001b[0m\u001b[33m at position \u001b[0m\u001b[33m{}\u001b[0m\u001b[33m.\u001b[0m\u001b[33m\"\u001b[0m                \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m \u001b[31m❱ \u001b[0m328 \u001b[2m│   \u001b[0m\u001b[94mraise\u001b[0m \u001b[96mTypeError\u001b[0m(msg.format(fun_name, \u001b[96mtype\u001b[0m(arg), pos))                                  \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m329 \u001b[0m                                                                                           \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m330 \u001b[0m                                                                                           \u001b[31m│\u001b[0m\n",
+              "\u001b[31m│\u001b[0m   \u001b[2m331 \u001b[0m\u001b[94mdef\u001b[0m \u001b[92mcheck_arraylike_or_none\u001b[0m(fun_name: \u001b[96mstr\u001b[0m, *args: Any):                                    \u001b[31m│\u001b[0m\n",
+              "\u001b[31m╰──────────────────────────────────────────────────────────────────────────────────────────────────╯\u001b[0m\n",
+              "\u001b[1;91mTypeError: \u001b[0mzeros_like requires ndarray or scalar arguments, got \u001b[1m<\u001b[0m\u001b[1;95mclass\u001b[0m\u001b[39m \u001b[0m\u001b[32m'generator'\u001b[0m\u001b[1m>\u001b[0m at position \u001b[1;36m0\u001b[0m.\n"
+            ],
+            "text/html": [
+              "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #800000; text-decoration-color: #800000\">╭─────────────────────────────── </span><span style=\"color: #800000; text-decoration-color: #800000; font-weight: bold\">Traceback </span><span style=\"color: #bf7f7f; text-decoration-color: #bf7f7f; font-weight: bold\">(most recent call last)</span><span style=\"color: #800000; text-decoration-color: #800000\"> ────────────────────────────────╮</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> in <span style=\"color: #00ff00; text-decoration-color: #00ff00\">&lt;cell line: 46&gt;</span>:<span style=\"color: #0000ff; text-decoration-color: #0000ff\">46</span>                                                                            <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>                                                                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> <span style=\"color: #bfbf7f; text-decoration-color: #bfbf7f\">/usr/local/lib/python3.10/dist-packages/optax/_src/</span><span style=\"color: #808000; text-decoration-color: #808000; font-weight: bold\">combine.py</span>:<span style=\"color: #0000ff; text-decoration-color: #0000ff\">50</span> in <span style=\"color: #00ff00; text-decoration-color: #00ff00\">init_fn</span>                      <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>                                                                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 47   </span>init_fns, update_fns = <span style=\"color: #00ffff; text-decoration-color: #00ffff\">zip</span>(*transforms)                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 48   </span>                                                                                         <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 49   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">def</span> <span style=\"color: #00ff00; text-decoration-color: #00ff00\">init_fn</span>(params):                                                                     <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> <span style=\"color: #800000; text-decoration-color: #800000\">❱ </span> 50 <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">│   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">return</span> <span style=\"color: #00ffff; text-decoration-color: #00ffff\">tuple</span>(fn(params) <span style=\"color: #0000ff; text-decoration-color: #0000ff\">for</span> fn <span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">in</span> init_fns)                                            <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 51   </span>                                                                                         <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 52   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">def</span> <span style=\"color: #00ff00; text-decoration-color: #00ff00\">update_fn</span>(updates, state, params=<span style=\"color: #0000ff; text-decoration-color: #0000ff\">None</span>, **extra_args):                                <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 53 │   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">if</span> <span style=\"color: #00ffff; text-decoration-color: #00ffff\">len</span>(update_fns) != <span style=\"color: #00ffff; text-decoration-color: #00ffff\">len</span>(state):                                                      <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>                                                                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> <span style=\"color: #bfbf7f; text-decoration-color: #bfbf7f\">/usr/local/lib/python3.10/dist-packages/optax/_src/</span><span style=\"color: #808000; text-decoration-color: #808000; font-weight: bold\">combine.py</span>:<span style=\"color: #0000ff; text-decoration-color: #0000ff\">50</span> in <span style=\"color: #00ff00; text-decoration-color: #00ff00\">&lt;genexpr&gt;</span>                    <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>                                                                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 47   </span>init_fns, update_fns = <span style=\"color: #00ffff; text-decoration-color: #00ffff\">zip</span>(*transforms)                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 48   </span>                                                                                         <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 49   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">def</span> <span style=\"color: #00ff00; text-decoration-color: #00ff00\">init_fn</span>(params):                                                                     <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> <span style=\"color: #800000; text-decoration-color: #800000\">❱ </span> 50 <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">│   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">return</span> <span style=\"color: #00ffff; text-decoration-color: #00ffff\">tuple</span>(fn(params) <span style=\"color: #0000ff; text-decoration-color: #0000ff\">for</span> fn <span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">in</span> init_fns)                                            <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 51   </span>                                                                                         <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 52   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">def</span> <span style=\"color: #00ff00; text-decoration-color: #00ff00\">update_fn</span>(updates, state, params=<span style=\"color: #0000ff; text-decoration-color: #0000ff\">None</span>, **extra_args):                                <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 53 │   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">if</span> <span style=\"color: #00ffff; text-decoration-color: #00ffff\">len</span>(update_fns) != <span style=\"color: #00ffff; text-decoration-color: #00ffff\">len</span>(state):                                                      <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>                                                                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> <span style=\"color: #bfbf7f; text-decoration-color: #bfbf7f\">/usr/local/lib/python3.10/dist-packages/optax/_src/</span><span style=\"color: #808000; text-decoration-color: #808000; font-weight: bold\">transform.py</span>:<span style=\"color: #0000ff; text-decoration-color: #0000ff\">335</span> in <span style=\"color: #00ff00; text-decoration-color: #00ff00\">init_fn</span>                   <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>                                                                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 332   </span>mu_dtype = utils.canonicalize_dtype(mu_dtype)                                           <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 333   </span>                                                                                        <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 334   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">def</span> <span style=\"color: #00ff00; text-decoration-color: #00ff00\">init_fn</span>(params):                                                                    <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> <span style=\"color: #800000; text-decoration-color: #800000\">❱ </span> 335 <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">│   </span>mu = jax.tree_util.tree_map(  <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"># First moment</span>                                          <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 336 │   │   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">lambda</span> t: jnp.zeros_like(t, dtype=mu_dtype), params)                              <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 337 │   </span>nu = jax.tree_util.tree_map(jnp.zeros_like, params)  <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"># Second moment</span>                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 338 │   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">return</span> ScaleByAdamState(count=jnp.zeros([], jnp.int32), mu=mu, nu=nu)                 <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>                                                                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> <span style=\"color: #bfbf7f; text-decoration-color: #bfbf7f\">/usr/local/lib/python3.10/dist-packages/jax/_src/</span><span style=\"color: #808000; text-decoration-color: #808000; font-weight: bold\">tree_util.py</span>:<span style=\"color: #0000ff; text-decoration-color: #0000ff\">210</span> in <span style=\"color: #00ff00; text-decoration-color: #00ff00\">tree_map</span>                    <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>                                                                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">207 </span><span style=\"color: #bfbf7f; text-decoration-color: #bfbf7f\">  </span><span style=\"color: #808000; text-decoration-color: #808000\">\"\"\"</span>                                                                                      <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">208   </span>leaves, treedef = tree_flatten(tree, is_leaf)                                            <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">209   </span>all_leaves = [leaves] + [treedef.flatten_up_to(r) <span style=\"color: #0000ff; text-decoration-color: #0000ff\">for</span> r <span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">in</span> rest]                         <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> <span style=\"color: #800000; text-decoration-color: #800000\">❱ </span>210 <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">  </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">return</span> treedef.unflatten(f(*xs) <span style=\"color: #0000ff; text-decoration-color: #0000ff\">for</span> xs <span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">in</span> <span style=\"color: #00ffff; text-decoration-color: #00ffff\">zip</span>(*all_leaves))                              <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">211 </span>                                                                                           <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">212 </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">def</span> <span style=\"color: #00ff00; text-decoration-color: #00ff00\">build_tree</span>(treedef: PyTreeDef, xs: Any) -&gt; Any:                                        <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">213   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">return</span> treedef.from_iterable_tree(xs)                                                    <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>                                                                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> <span style=\"color: #bfbf7f; text-decoration-color: #bfbf7f\">/usr/local/lib/python3.10/dist-packages/jax/_src/</span><span style=\"color: #808000; text-decoration-color: #808000; font-weight: bold\">tree_util.py</span>:<span style=\"color: #0000ff; text-decoration-color: #0000ff\">210</span> in <span style=\"color: #00ff00; text-decoration-color: #00ff00\">&lt;genexpr&gt;</span>                   <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>                                                                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">207 </span><span style=\"color: #bfbf7f; text-decoration-color: #bfbf7f\">  </span><span style=\"color: #808000; text-decoration-color: #808000\">\"\"\"</span>                                                                                      <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">208   </span>leaves, treedef = tree_flatten(tree, is_leaf)                                            <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">209   </span>all_leaves = [leaves] + [treedef.flatten_up_to(r) <span style=\"color: #0000ff; text-decoration-color: #0000ff\">for</span> r <span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">in</span> rest]                         <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> <span style=\"color: #800000; text-decoration-color: #800000\">❱ </span>210 <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">  </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">return</span> treedef.unflatten(f(*xs) <span style=\"color: #0000ff; text-decoration-color: #0000ff\">for</span> xs <span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">in</span> <span style=\"color: #00ffff; text-decoration-color: #00ffff\">zip</span>(*all_leaves))                              <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">211 </span>                                                                                           <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">212 </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">def</span> <span style=\"color: #00ff00; text-decoration-color: #00ff00\">build_tree</span>(treedef: PyTreeDef, xs: Any) -&gt; Any:                                        <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">213   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">return</span> treedef.from_iterable_tree(xs)                                                    <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>                                                                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> <span style=\"color: #bfbf7f; text-decoration-color: #bfbf7f\">/usr/local/lib/python3.10/dist-packages/optax/_src/</span><span style=\"color: #808000; text-decoration-color: #808000; font-weight: bold\">transform.py</span>:<span style=\"color: #0000ff; text-decoration-color: #0000ff\">336</span> in <span style=\"color: #00ff00; text-decoration-color: #00ff00\">&lt;lambda&gt;</span>                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>                                                                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 333   </span>                                                                                        <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 334   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">def</span> <span style=\"color: #00ff00; text-decoration-color: #00ff00\">init_fn</span>(params):                                                                    <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 335 │   </span>mu = jax.tree_util.tree_map(  <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"># First moment</span>                                          <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> <span style=\"color: #800000; text-decoration-color: #800000\">❱ </span> 336 <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">│   │   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">lambda</span> t: jnp.zeros_like(t, dtype=mu_dtype), params)                              <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 337 │   </span>nu = jax.tree_util.tree_map(jnp.zeros_like, params)  <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"># Second moment</span>                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 338 │   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">return</span> ScaleByAdamState(count=jnp.zeros([], jnp.int32), mu=mu, nu=nu)                 <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\"> 339 </span>                                                                                          <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>                                                                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> <span style=\"color: #bfbf7f; text-decoration-color: #bfbf7f\">/usr/local/lib/python3.10/dist-packages/jax/_src/numpy/</span><span style=\"color: #808000; text-decoration-color: #808000; font-weight: bold\">lax_numpy.py</span>:<span style=\"color: #0000ff; text-decoration-color: #0000ff\">2054</span> in <span style=\"color: #00ff00; text-decoration-color: #00ff00\">zeros_like</span>           <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>                                                                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">2051 </span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff; font-weight: bold\">@util</span>._wraps(np.zeros_like)                                                               <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">2052 </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">def</span> <span style=\"color: #00ff00; text-decoration-color: #00ff00\">zeros_like</span>(a: ArrayLike, dtype: Optional[DTypeLike] = <span style=\"color: #0000ff; text-decoration-color: #0000ff\">None</span>,                           <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">2053 │   │   │      </span>shape: Any = <span style=\"color: #0000ff; text-decoration-color: #0000ff\">None</span>) -&gt; Array:                                               <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> <span style=\"color: #800000; text-decoration-color: #800000\">❱ </span>2054 <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">  </span>util.check_arraylike(<span style=\"color: #808000; text-decoration-color: #808000\">\"zeros_like\"</span>, a)                                                   <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">2055   </span>dtypes.check_user_dtype_supported(dtype, <span style=\"color: #808000; text-decoration-color: #808000\">\"zeros_like\"</span>)                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">2056   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">if</span> shape <span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">is</span> <span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">not</span> <span style=\"color: #0000ff; text-decoration-color: #0000ff\">None</span>:                                                                   <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">2057 │   </span>shape = canonicalize_shape(shape)                                                     <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>                                                                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> <span style=\"color: #bfbf7f; text-decoration-color: #bfbf7f\">/usr/local/lib/python3.10/dist-packages/jax/_src/numpy/</span><span style=\"color: #808000; text-decoration-color: #808000; font-weight: bold\">util.py</span>:<span style=\"color: #0000ff; text-decoration-color: #0000ff\">328</span> in <span style=\"color: #00ff00; text-decoration-color: #00ff00\">check_arraylike</span>            <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>                                                                                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">325 │   </span>pos, arg = <span style=\"color: #00ffff; text-decoration-color: #00ffff\">next</span>((i, arg) <span style=\"color: #0000ff; text-decoration-color: #0000ff\">for</span> i, arg <span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">in</span> <span style=\"color: #00ffff; text-decoration-color: #00ffff\">enumerate</span>(args)                                 <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">326 │   │   │   │   │   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">if</span> <span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">not</span> _arraylike(arg))                                                <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">327 │   </span>msg = <span style=\"color: #808000; text-decoration-color: #808000\">\"{} requires ndarray or scalar arguments, got {} at position {}.\"</span>                <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span> <span style=\"color: #800000; text-decoration-color: #800000\">❱ </span>328 <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">│   </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">raise</span> <span style=\"color: #00ffff; text-decoration-color: #00ffff\">TypeError</span>(msg.format(fun_name, <span style=\"color: #00ffff; text-decoration-color: #00ffff\">type</span>(arg), pos))                                  <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">329 </span>                                                                                           <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">330 </span>                                                                                           <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">│</span>   <span style=\"color: #7f7f7f; text-decoration-color: #7f7f7f\">331 </span><span style=\"color: #0000ff; text-decoration-color: #0000ff\">def</span> <span style=\"color: #00ff00; text-decoration-color: #00ff00\">check_arraylike_or_none</span>(fun_name: <span style=\"color: #00ffff; text-decoration-color: #00ffff\">str</span>, *args: Any):                                    <span style=\"color: #800000; text-decoration-color: #800000\">│</span>\n",
+              "<span style=\"color: #800000; text-decoration-color: #800000\">╰──────────────────────────────────────────────────────────────────────────────────────────────────╯</span>\n",
+              "<span style=\"color: #ff0000; text-decoration-color: #ff0000; font-weight: bold\">TypeError: </span>zeros_like requires ndarray or scalar arguments, got <span style=\"font-weight: bold\">&lt;</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff; font-weight: bold\">class</span><span style=\"color: #000000; text-decoration-color: #000000\"> </span><span style=\"color: #008000; text-decoration-color: #008000\">'generator'</span><span style=\"font-weight: bold\">&gt;</span> at position <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0</span>.\n",
+              "</pre>\n"
+            ]
+          },
+          "metadata": {}
+        }
+      ],
+      "source": [
+        "# Initialize the model parameters using JAX's PRNG key\n",
+        "rng_key = jax.random.PRNGKey(0)\n",
+        "input_ids = jnp.array([[1, 2, 3, 4, 5]])\n",
+        "decoder_input_ids = jnp.array([[1, 2, 3, 4, 5]])\n",
+        "params = model.parameters()  # Returns an iterable over the parameters \n",
+        "#params = model.named_parameters() # Returns an iterable over the parameters and their names\n",
+        "\n",
+        "# Modify my_model to use the LongT5-XL model instead of the custom model defined earlier\n",
+        "def my_model(params, x):\n",
+        "    logits = model(input_ids=x, params=params, train=True).logits\n",
+        "    return jnp.mean(logits)\n",
+        "\n",
+        "# Define a loss function for the LongT5-XL model\n",
+        "@jax.jit\n",
+        "def compute_loss(params, input_ids, decoder_input_ids, labels):\n",
+        "  logits = model(\n",
+        "      input_ids=input_ids, \n",
+        "      decoder_input_ids=decoder_input_ids,   \n",
+        "      params=params, \n",
+        "      train=True\n",
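+        "  ).logits\n",
+        "  # Assumption: `labels` holds integer token ids aligned with the decoder positions\n",
+        "  return optax.softmax_cross_entropy_with_integer_labels(logits, labels).mean()\n",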
+        "\n",
+        "# Transform the loss function to get the gradients\n",
+        "grad_fn = jax.value_and_grad(compute_loss)\n",
+        "\n",
+        "# Define an optimizer to update the parameters using the gradients\n",
+        "optimizer = optax.adam(learning_rate=1e-3)\n",
+        "\n",
+        "# Define a train step function which combines the loss function and optimizer update, does the forward and backward pass, and returns the updated parameters\n",
+        "@jax.jit\n",
+        "def train_step(params, x, y, optimizer_state):\n",
+        "    # jax.value_and_grad returns (value, grads), in that order\n",
+        "    loss, grads = grad_fn(params, x, y)\n",
+        "    updates, optimizer_state = optimizer.update(grads, optimizer_state)\n",
+        "    new_params = optax.apply_updates(params, updates)\n",
+        "    return new_params, loss, optimizer_state\n",
+        "\n",
+        "# Define a batch generator function using get_batches() from stackoverflow.com\n",
+        "def generate_batch(batch_size, rng, DIM=512):\n",
+        "    # Generate a batch of input-output pairs\n",
+        "    X_batch = list(jax.random.normal(rng, (batch_size, DIM)))\n",
+        "    Y_batch = jax.random.randint(rng, (batch_size,), 0, 2, dtype=jnp.int32)\n",
+        "\n",
+        "    return X_batch, Y_batch\n",
+        "\n",
+        "# Initialize the optimizer state and the PRNG key\n",
+        "optimizer_state = optimizer.init(params)\n",
+        "rng = jax.random.PRNGKey(0)\n",
+        "\n",
+        "# Train the model\n",
+        "num_steps = 50\n",
+        "batch_size = 4\n",
+        "\n",
+        "for i in range(num_steps):\n",
+        "    # Generate a batch of input-output pairs\n",
+        "    x_batch, y_batch = generate_batch(batch_size, rng)\n",
+        "    \n",
+        "    # Update the parameters and optimizer state\n",
+        "    params, loss, optimizer_state = train_step(params, x_batch, y_batch, optimizer_state)\n",
+        "    \n",
+        "    # Print the loss every 10 steps\n",
+        "    if i % 10 == 0:\n",
+        "        print(f'Step {i}, Loss: {loss}')    \n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "RlCLAmjBvhnA"
+      },
+      "source": [
+        "GPT-Q needs input data for quantization. For an actual model we'd use real data but here we'll just make some random inputs."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "6govTMOZvgSC"
+      },
+      "outputs": [],
+      "source": [
+        "quant_data = [jax.random.normal(key, (batch_size, DIM)) for key in jax.random.split(data_key, 64)]\n",
+        "\n",
+        "# We'll save an output for later comparison since the quantization process will delete the original params\n",
+        "original_output = my_model(params, quant_data[0])"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Rjdb3h46vtsi"
+      },
+      "source": [
+        "### Run GPT-Q to get the quantized weights\n",
+        "That's all for the setup, we can now just run GPT-Q (without any changes to the original model code):"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "collapsed": true,
+        "id": "L1Mw9ZLpvrLa"
+      },
+      "outputs": [],
+      "source": [
+        "# Note that this may free the buffers associated with some or all of the parameters and the data to save VRAM\n",
+        "# I'd also recommend you put the params on the CPU, since `quantize()` will move the params to the GPU when necessary\n",
+        "quantized_params = jax_gptq.quantize(my_model, params, quant_data)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "2NhVv8egwDQu"
+      },
+      "source": [
+        "The matrices have been quantized but the biases have been left alone:"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "bWwXzTJyubbH"
+      },
+      "outputs": [],
+      "source": [
+        "print(f'W type: {type(quantized_params[0][\"w\"])}')\n",
+        "print(f'B type: {type(quantized_params[0][\"b\"])}')"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "QwYLTr6WwapB"
+      },
+      "source": [
+        "**Note**: The quantization procedure depends on the parameter being used in a matrix multiplication. Currently JAX-GPTQ supports general dot operations (including ones using tensors with any number of dimensions larger than 1), and convolutions with kernels of spatial size 1.\n",
+        "\n",
+        "### Applying the quantized weights\n",
+        "We can now run the quantized model without any code changes. All that's necessary is using `jax_gptq.use_quantized` to transform the function so it knows how to handle `QuantizedMatrix` values."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "I6aLdXqawQFs"
+      },
+      "outputs": [],
+      "source": [
+        "quantized_params = jax.device_put(quantized_params, gpu) # Move the params to the GPU\n",
+        "\n",
+        "# Originally:\n",
+        "# my_model(params, inputs)\n",
+        "# After:\n",
+        "# jax_gptq.use_quantized(my_model)(params, inputs)\n",
+        "quant_output = jax_gptq.use_quantized(my_model)(quantized_params, quant_data[0])\n",
+        "\n",
+        "print(f'Output of quantized network: {quant_output:.3e}')\n",
+        "print(f'Original output: {original_output:.3e}')"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "1vXkTTctx7Vo"
+      },
+      "source": [
+        "### Train with LoRA\n",
+        "\n",
+        "Now that we've compressed our model to 4 bits (and change) per parameter, we can add full precision LoRA parameters for finetuning.\n",
+        "\n",
+        "The one gotcha about combining the two is that Lorax doesn't know that `QuantizedMatrix` values are pytree leaves, so you need to give the Lorax functions an `is_leaf` predicate."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "l95MirHdzNo9"
+      },
+      "source": [
+        "**Initialization:** The `init_lora` function expects a pytree describing which parameters should get LoRA parameters, which should be fully trained, and which should be left frozen. `lorax.simple_spec` is a helper function for making these specs."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "HKkhcjx9zJy6"
+      },
+      "outputs": [],
+      "source": [
+        "def is_leaf(x):\n",
+        "  return isinstance(x, jax_gptq.QuantizedMatrix)\n",
+        "\n",
+        "lora_spec = lorax.simple_spec(\n",
+        "    params=quantized_params,\n",
+        "    decision_fn=lambda pytree_path, arr: 4, # Just ignore the inputs and specify an inner rank of 4 for all params\n",
+        "    tune_vectors=False, # Tell Lorax to put all the biases in the frozen params tree instead of the tunable params tree\n",
+        "    is_leaf=is_leaf\n",
+        ")\n",
+        "\n",
+        "# Lorax splits the parameters into two pytrees:\n",
+        "# freeze_params: Anything which received the value lorax.LORA_FREEZE in the spec\n",
+        "# train_params: Pairs of two narrow matrices for values which got positive integers as spec values, or the full parameter if the value lorax.LORA_FULL was in the spec\n",
+        "freeze_params, train_params = lorax.init_lora(quantized_params, lora_spec, jax.random.PRNGKey(1234), is_leaf=is_leaf)\n",
+        "\n",
+        "def merge_quantized_with_lora(q_params, lora_freeze):\n",
+        "    return jax.tree_util.tree_map(\n",
+        "        lambda quant, from_lora: quant if isinstance(quant, jax_gptq.QuantizedMatrix) else from_lora,\n",
+        "        q_params,\n",
+        "        lora_freeze,\n",
+        "        is_leaf=lambda x: isinstance(x, jax_gptq.QuantizedMatrix) # Tell tree_map to treat QuantizedMatrix as a single value instead of a non-leaf node\n",
+        "    )\n",
+        "# Now we put the actual quantized params back\n",
+        "#freeze_params = merge_quantized_with_lora(quantized_params, freeze_params)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "-ebT9GXp16v4"
+      },
+      "source": [
+        "The `lorax.lora` transform converts a function from expecting a single pytree in the specified argument to expecting a tuple of two pytrees. It composes with other JAX transforms such as `jax_gptq.use_quantized`, so we can use both at once with no modifications to our model code."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "1XjjuQcq1oSq"
+      },
+      "outputs": [],
+      "source": [
+        "combined_params = (freeze_params, train_params)\n",
+        "\n",
+        "my_model_with_lora_and_quantized_weights = jax_gptq.use_quantized(lorax.lora(my_model))\n",
+        "\n",
+        "# The differences from the original `my_model` function are:\n",
+        "# 1. The params argument now expects a tuple of (frozen_params, trainable_params)\n",
+        "# 2. It knows how to compute with quantized weights\n",
+        "quantized_plus_lorax_output = my_model_with_lora_and_quantized_weights(combined_params, quant_data[0])\n",
+        "\n",
+        "print(f'GPTQ + Lorax output: {quantized_plus_lorax_output:.3e}')\n",
+        "print(f'GPTQ only: {quant_output:.3e}')"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "aIywP5qQ3KEH"
+      },
+      "source": [
+        "The above values are identical since LoRA initializes one of each pair of matrices as zeros.\n",
+        "\n",
+        "Let's look at the size of each pytree:"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "nqQwBPjh2ttl"
+      },
+      "outputs": [],
+      "source": [
+        "count_params = partial(jax.tree_util.tree_reduce,\n",
+        "  lambda acc, param: acc + (param.size if isinstance(param, jnp.ndarray) else 0),\n",
+        "  initializer=0\n",
+        ")\n",
+        "\n",
+        "print(f'{count_params(freeze_params):.3e} frozen params')\n",
+        "print(f'{count_params(train_params):.3e} trainable params')"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "0CJ58F005g-c"
+      },
+      "source": [
+        "Training with this function is no different from training any other JAX function; just make sure to differentiate your loss with respect to the trainable parameters only. (See the next section for an example.)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "m_lDOLnw5zoC"
+      },
+      "source": [
+        "## GPT-Q-ing + LoRA-ing HuggingFace's Flax GPT-2\n",
+        "I developed these transforms for use with my Haiku models, but since all JAX models are pure functions at the end of the day, it shouldn't matter what framework you use. Lorax supports matmuls and other matmul-like operations such as embedding lookups and 1-D convs.\n",
+        "\n",
+        "This is a minimal example of applying the combination to `gpt2-medium`, but it's basically model agnostic.\n",
+        "\n",
+        "First let's get the model:"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "czS5kDWO6XTv"
+      },
+      "outputs": [],
+      "source": [
+        "from transformers import AutoTokenizer, FlaxAutoModelForCausalLM"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "VnfmpQ6f6Yal"
+      },
+      "outputs": [],
+      "source": [
+        "model_name = 'gpt2-medium'\n",
+        "tokenizer = AutoTokenizer.from_pretrained(model_name)\n",
+        "model, params = FlaxAutoModelForCausalLM.from_pretrained(model_name, _do_init=False)\n",
+        "params = jax.device_put(params, cpu)\n",
+        "\n",
+        "# Because the embedding table is reused as the output linear layer, it'll get quantized at the end of the process, but that will seriously screw up the embedding lookup step, so we'll just save it for later here\n",
+        "orig_embedding_table = np.asarray(params['transformer']['wte']['embedding'])"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "evCyWa787m_N"
+      },
+      "source": [
+        "The GPT-Q paper used real text data for quantization, but for this demo I'll just generate some random values."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "ao_vTWAf7Tw-"
+      },
+      "outputs": [],
+      "source": [
+        "QUANT_BATCH_SIZE = 4\n",
+        "QUANT_EXAMPLE_LENGTH = 64 # I'd recommend making this bigger, but needs to be small to not crash colab\n",
+        "\n",
+        "quantization_data = []\n",
+        "key = jax.random.PRNGKey(0)\n",
+        "for _ in range(32):\n",
+        "  batch = jax.random.randint(key, (QUANT_BATCH_SIZE, QUANT_EXAMPLE_LENGTH), 0, 50256)\n",
+        "  quantization_data.append(batch)\n",
+        "  key, = jax.random.split(key, 1)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "0x_pT_fT8Co8"
+      },
+      "source": [
+        "HuggingFace's models don't have quite the right call signature, so we'll make a wrapper which takes `(params, inputs)` as arguments:"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "collapsed": true,
+        "id": "yddz4OUN8Bvt"
+      },
+      "outputs": [],
+      "source": [
+        "def apply_model(params, batch):\n",
+        "  return model(batch, params=params)\n",
+        "\n",
+        "quantized_params = jax_gptq.quantize(apply_model, params, quantization_data)"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "ehblO3I98akJ"
+      },
+      "outputs": [],
+      "source": [
+        "# Replace the quantized embedding table with the original one\n",
+        "quantized_params['transformer']['wte']['embedding'] = jnp.asarray(orig_embedding_table)\n",
+        "quantized_params = jax.device_put(quantized_params, gpu)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "WYiCG5fE9yKT"
+      },
+      "source": [
+        "### Finetuning GPT-2 with Lorax\n",
+        "\n",
+        "Same as [above](https://colab.research.google.com/drive/18rkULbWqk7mNZDx7Scx-JS3p_s45mgok#scrollTo=HKkhcjx9zJy6&line=3&uniqifier=1), we get the original param structure to tell Lorax how to initialize the LoRA params, then merge the quantized params back in after."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "FKS_dfll93sO"
+      },
+      "outputs": [],
+      "source": [
+        "# Get pre-quantization param tree (some nodes will just be abstract values)\n",
+        "orig_params_or_shapes = jax_gptq.utils.quantized_params_to_shaped_arrays(quantized_params)\n",
+        "\n",
+        "# Tell Lorax which leaves should be frozen/fully trained/LoRA trained\n",
+        "spec = lorax.simple_spec(\n",
+        "    orig_params_or_shapes,\n",
+        "    lambda path, arr: 16 if any(pattern in path for pattern in ['c_attn', 'mlp']) else lorax.LORA_FREEZE,\n",
+        "    tune_vectors=True\n",
+        ")\n",
+        "\n",
+        "# Initialize parameters\n",
+        "key, init_key = jax.random.split(key)\n",
+        "freeze_params, train_params = lorax.init_lora(\n",
+        "    orig_params_or_shapes,\n",
+        "    spec,\n",
+        "    init_key\n",
+        ")\n",
+        "\n",
+        "# Put the quantized params back into the frozen param tree\n",
+        "freeze_params = merge_quantized_with_lora(quantized_params, freeze_params)\n",
+        "combined_params = freeze_params, train_params"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "T8bJwqN2Bfqh"
+      },
+      "source": [
+        "Now we can just transform the `apply_model` function and it will use both LoRA and 4-bit quantized parameters."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "glARn7Z0BX4g"
+      },
+      "outputs": [],
+      "source": [
+        "quantized_plus_lora_fn = jax_gptq.use_quantized(lorax.lora(apply_model))"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Y1G-d0yDBn8y"
+      },
+      "source": [
+        "### Training\n",
+        "Training isn't actually any different from normal training, since you can just think of `freeze_params` as being a constant argument, but here's a demo for completeness.\n",
+        "\n",
+        "First I'll define a toy corpus which demonstrates Alan's love of cats and Grace's dislike of them."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "I3fdjSioBvDO"
+      },
+      "outputs": [],
+      "source": [
+        "CATS = ['lions', 'tigers', 'cheetahs', 'cats', 'ocelots', 'kittens']\n",
+        "DOGS = ['wolves', 'dogs', 'coyotes', 'huskies', 'poodles', 'puppies']\n",
+        "\n",
+        "CAT_LOVER = 'Alan'\n",
+        "DOG_LOVER = 'Grace'\n",
+        "\n",
+        "dataset = []\n",
+        "for name, polarity in [(CAT_LOVER, True), (DOG_LOVER, False)]:\n",
+        "  liked, disliked = (CATS, DOGS) if polarity else (DOGS, CATS)\n",
+        "  for kind in liked:\n",
+        "    dataset.append(f'{name}: {kind}? I love them!')\n",
+        "    dataset.append(f'{name}: Hey look at those {kind}, that\\'s pretty cool')\n",
+        "\n",
+        "  for kind in disliked:\n",
+        "    dataset.append(f'{name}: {kind}? I hate them!')\n",
+        "    dataset.append(f'{name}: Oh no, some {kind}! How scary!')\n",
+        "\n",
+        "tokenized_data = [jnp.asarray(tokenizer.encode(ex)) for ex in dataset]\n",
+        "max_len = max(ex.shape[0] for ex in tokenized_data)\n",
+        "# Pad every example to the same length so the jitted function compiles only once. Padding tokens are not masked (out of laziness).\n",
+        "tokenized_data = [jnp.pad(ex, (0, max_len - ex.shape[0])) for ex in tokenized_data]\n",
+        "\n",
+        "jitted_model = jax.jit(quantized_plus_lora_fn)\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "NZFLWJgxYqfh"
+      },
+      "outputs": [],
+      "source": [
+        "def make_prediction(params, prefix):\n",
+        "  tokens = jnp.asarray(tokenizer.encode(prefix))\n",
+        "  logits = jitted_model(params, tokens[None]).logits\n",
+        "\n",
+        "  probs = jax.nn.softmax(logits[0, -1])\n",
+        "  pred_probs, pred_words = jax.lax.top_k(probs, 5)\n",
+        "\n",
+        "  print(f'Predictions for: \"{prefix}\"')\n",
+        "  for i, (word_id, prob) in enumerate(zip(pred_words, pred_probs), 1):\n",
+        "    print(f'{i}. {tokenizer.decode([word_id])} - {prob:.2%}')\n",
+        "  print()\n",
+        "\n",
+        "test_examples = [\n",
+        "    f'{CAT_LOVER}: jaguars? I',\n",
+        "    f'{DOG_LOVER}: jaguars? I'\n",
+        "]"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "yT7hOBnYS-AC"
+      },
+      "source": [
+        "Let's look at the next-word predictions of the model before any LoRA training:"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "eew7ihGJTD85"
+      },
+      "outputs": [],
+      "source": [
+        "for ex in test_examples:\n",
+        "  make_prediction(combined_params, ex)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "BrSL1MgSDXfO"
+      },
+      "source": [
+        "Next we set up a standard training loop. The only difference is that we keep the train/freeze params separate for the optimizer. No changes are needed for the quantization.\n",
+        "\n",
+        "I'll just train with a batch size of 1 here since I don't want to bother with masking, but the transformed model function is fully compatible with vmap etc."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "52QdkmIxDHk-"
+      },
+      "outputs": [],
+      "source": [
+        "def loss_fn(train_params, freeze_params, seq):\n",
+        "  inputs = seq[:-1]\n",
+        "  targets = seq[1:]\n",
+        "\n",
+        "  combined_params = (freeze_params, train_params)\n",
+        "  logits = quantized_plus_lora_fn(combined_params, inputs[None]).logits[0]\n",
+        "  logprobs = jax.nn.log_softmax(logits)\n",
+        "  losses = -jnp.take_along_axis(logprobs, targets[:, None], axis=-1)\n",
+        "  return jnp.mean(losses)\n",
+        "\n",
+        "optimizer = optax.adamw(learning_rate=1e-4, weight_decay=1e-4)\n",
+        "opt_state = optimizer.init(combined_params[1])\n",
+        "\n",
+        "@jax.jit\n",
+        "def update_fn(combined_params, opt_state, example):\n",
+        "  freeze_params, train_params = combined_params\n",
+        "\n",
+        "  # The main thing is that we have to split up the params here so that JAX knows what to differentiate with respect to\n",
+        "  loss, grads = jax.value_and_grad(loss_fn)(train_params, freeze_params, example)\n",
+        "\n",
+        "  updates, opt_state = optimizer.update(grads, opt_state, params=train_params)\n",
+        "  new_train_params = optax.apply_updates(train_params, updates)\n",
+        "  return (freeze_params, new_train_params), opt_state, loss"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "cj2d1xIqFJw3"
+      },
+      "outputs": [],
+      "source": [
+        "bar = trange(50)\n",
+        "for epoch in bar:\n",
+        "  key, perm_key = jax.random.split(key)\n",
+        "  permutation = jax.random.permutation(perm_key, jnp.arange(len(dataset)))\n",
+        "  total_loss = 0\n",
+        "  for index in permutation:\n",
+        "    example = tokenized_data[index]\n",
+        "    combined_params, opt_state, loss = update_fn(combined_params, opt_state, example)\n",
+        "    total_loss += loss\n",
+        "  bar.set_description(f'Epoch {epoch} - Loss: {total_loss / len(tokenized_data):.3e}')"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "IMFZwE8qeSUl"
+      },
+      "source": [
+        "The trained LoRA parameters give us a model which predicts that Alan will love jaguars, and Grace will hate them:"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "GIgThnapFQS6"
+      },
+      "outputs": [],
+      "source": [
+        "for example in test_examples:\n",
+        "  make_prediction(combined_params, example)\n",
+        "  print()"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "92W8jCjQeZ9J"
+      },
+      "outputs": [],
+      "source": []
+    }
+  ],
+  "metadata": {
+    "accelerator": "GPU",
+    "colab": {
+      "collapsed_sections": [
+        "0Y6JeyF45yd_"
+      ],
+      "gpuType": "T4",
+      "provenance": []
+    },
+    "kernelspec": {
+      "display_name": "Python 3",
+      "name": "python3"
+    },
+    "language_info": {
+      "codemirror_mode": {
+        "name": "ipython",
+        "version": 3
+      },
+      "file_extension": ".py",
+      "mimetype": "text/x-python",
+      "name": "python",
+      "nbconvert_exporter": "python",
+      "pygments_lexer": "ipython3",
+      "version": "3.10.10"
+    },
+    "widgets": {
+      "application/vnd.jupyter.widget-state+json": {
+        "1feaed28d930404d8684eb14f7f363c7": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HBoxModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HBoxModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HBoxView",
+            "box_style": "",
+            "children": [
+              "IPY_MODEL_3b045a0e3ecf4c098d8fa8b0917108a5",
+              "IPY_MODEL_ab35fcc175334e089b9abf95d91604c3",
+              "IPY_MODEL_bc4dcd32917f4970843bdde97d849285"
+            ],
+            "layout": "IPY_MODEL_ba84ab17fb18462a82cd6ddd9581d8de"
+          }
+        },
+        "3b045a0e3ecf4c098d8fa8b0917108a5": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_479284c4dab249d5b88b9f471e988392",
+            "placeholder": "​",
+            "style": "IPY_MODEL_98ba61d9a09b42629f5985a4b82041c5",
+            "value": "Downloading (…)lve/main/config.json: 100%"
+          }
+        },
+        "ab35fcc175334e089b9abf95d91604c3": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "FloatProgressModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "FloatProgressModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "ProgressView",
+            "bar_style": "success",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_0438654193164051b8eaf4e234299bf2",
+            "max": 896,
+            "min": 0,
+            "orientation": "horizontal",
+            "style": "IPY_MODEL_dd336b937eff4790a8b7ed0b5f8d211f",
+            "value": 896
+          }
+        },
+        "bc4dcd32917f4970843bdde97d849285": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_d86b4cb7e789448a93d78e597a9c45eb",
+            "placeholder": "​",
+            "style": "IPY_MODEL_3fbe401e5b084e66b567e6a597183f01",
+            "value": " 896/896 [00:00&lt;00:00, 27.9kB/s]"
+          }
+        },
+        "ba84ab17fb18462a82cd6ddd9581d8de": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "479284c4dab249d5b88b9f471e988392": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "98ba61d9a09b42629f5985a4b82041c5": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "0438654193164051b8eaf4e234299bf2": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "dd336b937eff4790a8b7ed0b5f8d211f": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "ProgressStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "ProgressStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "bar_color": null,
+            "description_width": ""
+          }
+        },
+        "d86b4cb7e789448a93d78e597a9c45eb": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "3fbe401e5b084e66b567e6a597183f01": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "bd19569f52df4784ac7dbdfd159f5588": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HBoxModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HBoxModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HBoxView",
+            "box_style": "",
+            "children": [
+              "IPY_MODEL_1314b9257b2047769a7c95408ddaddc3",
+              "IPY_MODEL_196c4ac94eab47048c7de169ce8e12f1",
+              "IPY_MODEL_9c40c0629bac4d3884a991153bb61a15"
+            ],
+            "layout": "IPY_MODEL_44411fd1e3254d97b05e3b536c5b4f3b"
+          }
+        },
+        "1314b9257b2047769a7c95408ddaddc3": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_272afa7b903244e297c4904b19c01632",
+            "placeholder": "​",
+            "style": "IPY_MODEL_f6ef494723ad4b078cda848ec84bf621",
+            "value": "Downloading (…)model.bin.index.json: 100%"
+          }
+        },
+        "196c4ac94eab47048c7de169ce8e12f1": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "FloatProgressModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "FloatProgressModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "ProgressView",
+            "bar_style": "success",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_646e4a2d4ed34cd2ab73cb598a011075",
+            "max": 55432,
+            "min": 0,
+            "orientation": "horizontal",
+            "style": "IPY_MODEL_b4a84db21aa44c77bdbddbaabc3fc079",
+            "value": 55432
+          }
+        },
+        "9c40c0629bac4d3884a991153bb61a15": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_3313b927bd0a4546baf4ce6f6ebd5bcf",
+            "placeholder": "​",
+            "style": "IPY_MODEL_9b2914d7649a447a9e059d7a65600e44",
+            "value": " 55.4k/55.4k [00:00&lt;00:00, 1.38MB/s]"
+          }
+        },
+        "44411fd1e3254d97b05e3b536c5b4f3b": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "272afa7b903244e297c4904b19c01632": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "f6ef494723ad4b078cda848ec84bf621": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "646e4a2d4ed34cd2ab73cb598a011075": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "b4a84db21aa44c77bdbddbaabc3fc079": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "ProgressStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "ProgressStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "bar_color": null,
+            "description_width": ""
+          }
+        },
+        "3313b927bd0a4546baf4ce6f6ebd5bcf": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "9b2914d7649a447a9e059d7a65600e44": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "1169578ce65440f695f8ec234c84a90d": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HBoxModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HBoxModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HBoxView",
+            "box_style": "",
+            "children": [
+              "IPY_MODEL_a835304240c54c5db280426c5fb856c4",
+              "IPY_MODEL_cd42bbbaf3eb4c4fb1ade3c314b2eaac",
+              "IPY_MODEL_ebdcacb5d2094c878af40538fb2ec42c"
+            ],
+            "layout": "IPY_MODEL_3ead271f91ea4af3a64928754048020f"
+          }
+        },
+        "a835304240c54c5db280426c5fb856c4": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_93e3146c97c4430a84524f9fc804a219",
+            "placeholder": "​",
+            "style": "IPY_MODEL_0708839b96b1495a9e2f15f0d1b78574",
+            "value": "Downloading shards: 100%"
+          }
+        },
+        "cd42bbbaf3eb4c4fb1ade3c314b2eaac": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "FloatProgressModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "FloatProgressModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "ProgressView",
+            "bar_style": "success",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_fd56b1d95eb04339b9b01ba691db4e81",
+            "max": 2,
+            "min": 0,
+            "orientation": "horizontal",
+            "style": "IPY_MODEL_701570dcfc2a4b8d99b2ea9c08057fe7",
+            "value": 2
+          }
+        },
+        "ebdcacb5d2094c878af40538fb2ec42c": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_521d7df8efe849c0a43b7b2f2b10ba24",
+            "placeholder": "​",
+            "style": "IPY_MODEL_0da6f8caf9b34844bf763f7030c5e504",
+            "value": " 2/2 [01:32&lt;00:00, 40.63s/it]"
+          }
+        },
+        "3ead271f91ea4af3a64928754048020f": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "93e3146c97c4430a84524f9fc804a219": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "0708839b96b1495a9e2f15f0d1b78574": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "fd56b1d95eb04339b9b01ba691db4e81": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "701570dcfc2a4b8d99b2ea9c08057fe7": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "ProgressStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "ProgressStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "bar_color": null,
+            "description_width": ""
+          }
+        },
+        "521d7df8efe849c0a43b7b2f2b10ba24": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "0da6f8caf9b34844bf763f7030c5e504": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "f99effa921ce42e2b62ca92d18d6d8d0": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HBoxModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HBoxModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HBoxView",
+            "box_style": "",
+            "children": [
+              "IPY_MODEL_33889cada1fd42a3933eb4f1e923d7a4",
+              "IPY_MODEL_0f30b08741a74eab83dd3689671f1269",
+              "IPY_MODEL_fd73f715ee7548979c57df22efda5097"
+            ],
+            "layout": "IPY_MODEL_e44cf747982d41bb862843133b58f4b1"
+          }
+        },
+        "33889cada1fd42a3933eb4f1e923d7a4": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_8c12d66a9488411986f8a0d16822ba87",
+            "placeholder": "​",
+            "style": "IPY_MODEL_df7a49e2d8b34643baf60560e1aa89fc",
+            "value": "Downloading (…)l-00001-of-00002.bin: 100%"
+          }
+        },
+        "0f30b08741a74eab83dd3689671f1269": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "FloatProgressModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "FloatProgressModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "ProgressView",
+            "bar_style": "success",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_7974a3139aca4dabbc2bc57844121070",
+            "max": 9449929179,
+            "min": 0,
+            "orientation": "horizontal",
+            "style": "IPY_MODEL_865b5e4cd3a2499da68eaaf5d47dfc95",
+            "value": 9449929179
+          }
+        },
+        "fd73f715ee7548979c57df22efda5097": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_6a96f4ad805142468ae84bf77ed41d3a",
+            "placeholder": "​",
+            "style": "IPY_MODEL_6a751bdf2f094739b4c3cba478625f84",
+            "value": " 9.45G/9.45G [01:17&lt;00:00, 173MB/s]"
+          }
+        },
+        "e44cf747982d41bb862843133b58f4b1": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "8c12d66a9488411986f8a0d16822ba87": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "df7a49e2d8b34643baf60560e1aa89fc": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "7974a3139aca4dabbc2bc57844121070": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "865b5e4cd3a2499da68eaaf5d47dfc95": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "ProgressStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "ProgressStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "bar_color": null,
+            "description_width": ""
+          }
+        },
+        "6a96f4ad805142468ae84bf77ed41d3a": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "6a751bdf2f094739b4c3cba478625f84": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "c7f3959181c046c7b281c5af7d8c959c": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HBoxModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HBoxModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HBoxView",
+            "box_style": "",
+            "children": [
+              "IPY_MODEL_a90ac03129fd40a2b5c9ae8d59dcdaef",
+              "IPY_MODEL_20b6fae45ebd4badbf402c81e3d24d47",
+              "IPY_MODEL_10973afff9364526b481f09d89e1d7e2"
+            ],
+            "layout": "IPY_MODEL_9d734a6342524c6dab9699468609a1d4"
+          }
+        },
+        "a90ac03129fd40a2b5c9ae8d59dcdaef": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_630542ff534b410dadf99e20bb6194c5",
+            "placeholder": "​",
+            "style": "IPY_MODEL_a2d9be02f65a47cfbec746acd4a3360c",
+            "value": "Downloading (…)l-00002-of-00002.bin: 100%"
+          }
+        },
+        "20b6fae45ebd4badbf402c81e3d24d47": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "FloatProgressModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "FloatProgressModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "ProgressView",
+            "bar_style": "success",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_c6948d90a1484452bc693646ad619666",
+            "max": 1949494999,
+            "min": 0,
+            "orientation": "horizontal",
+            "style": "IPY_MODEL_01641210f615480f8c239feb99f0e323",
+            "value": 1949494999
+          }
+        },
+        "10973afff9364526b481f09d89e1d7e2": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_0012771409a04059bb6eb85a6ad5eff2",
+            "placeholder": "​",
+            "style": "IPY_MODEL_430f70d9467041f8bea72b4a16e430de",
+            "value": " 1.95G/1.95G [00:14&lt;00:00, 150MB/s]"
+          }
+        },
+        "9d734a6342524c6dab9699468609a1d4": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "630542ff534b410dadf99e20bb6194c5": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "a2d9be02f65a47cfbec746acd4a3360c": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "c6948d90a1484452bc693646ad619666": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "01641210f615480f8c239feb99f0e323": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "ProgressStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "ProgressStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "bar_color": null,
+            "description_width": ""
+          }
+        },
+        "0012771409a04059bb6eb85a6ad5eff2": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "430f70d9467041f8bea72b4a16e430de": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "88a9e1e6abdc439e9808df4235585137": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HBoxModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HBoxModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HBoxView",
+            "box_style": "",
+            "children": [
+              "IPY_MODEL_6874e404870a407ab1672e15612712d8",
+              "IPY_MODEL_80c4510fff3e44c58b52e314d4ce1174",
+              "IPY_MODEL_45771a1396094a2e9f8bb6dc6800b908"
+            ],
+            "layout": "IPY_MODEL_b3b373bdfc26499dbf400667b2edc6b4"
+          }
+        },
+        "6874e404870a407ab1672e15612712d8": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_8c93c66cdf0a44d4ab9bcc4df497d1ec",
+            "placeholder": "​",
+            "style": "IPY_MODEL_cd379c6435394897a3906d4bc6b71345",
+            "value": "Loading checkpoint shards: 100%"
+          }
+        },
+        "80c4510fff3e44c58b52e314d4ce1174": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "FloatProgressModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "FloatProgressModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "ProgressView",
+            "bar_style": "success",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_5d198996c0344b8f92409b1cbff19af4",
+            "max": 2,
+            "min": 0,
+            "orientation": "horizontal",
+            "style": "IPY_MODEL_d965c2e080bd452a99954a3b3af2c147",
+            "value": 2
+          }
+        },
+        "45771a1396094a2e9f8bb6dc6800b908": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_b79886293822421895753aa0e2d2d170",
+            "placeholder": "​",
+            "style": "IPY_MODEL_f27d00a5f5964ee684787d97f132834a",
+            "value": " 2/2 [00:52&lt;00:00, 23.26s/it]"
+          }
+        },
+        "b3b373bdfc26499dbf400667b2edc6b4": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "8c93c66cdf0a44d4ab9bcc4df497d1ec": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "cd379c6435394897a3906d4bc6b71345": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "5d198996c0344b8f92409b1cbff19af4": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "d965c2e080bd452a99954a3b3af2c147": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "ProgressStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "ProgressStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "bar_color": null,
+            "description_width": ""
+          }
+        },
+        "b79886293822421895753aa0e2d2d170": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "f27d00a5f5964ee684787d97f132834a": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "2f214f129655487baff53148920c7c95": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HBoxModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HBoxModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HBoxView",
+            "box_style": "",
+            "children": [
+              "IPY_MODEL_ef32e3942ca14835aaa1fcd1bfc80018",
+              "IPY_MODEL_43d505d0a0294e4abd4878f53d7f7bc8",
+              "IPY_MODEL_297ed97f4dbb43aba188a337e1d8bf26"
+            ],
+            "layout": "IPY_MODEL_0e2a8d04b5a143bc9bfa77f7b538954e"
+          }
+        },
+        "ef32e3942ca14835aaa1fcd1bfc80018": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_f99a48ae06114986aa4249d0876962a8",
+            "placeholder": "​",
+            "style": "IPY_MODEL_025cd7df6ab64885bb7939f28a28be28",
+            "value": "Downloading (…)neration_config.json: 100%"
+          }
+        },
+        "43d505d0a0294e4abd4878f53d7f7bc8": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "FloatProgressModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "FloatProgressModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "ProgressView",
+            "bar_style": "success",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_71401278061d4f55af185007f7e21289",
+            "max": 147,
+            "min": 0,
+            "orientation": "horizontal",
+            "style": "IPY_MODEL_36e7f52522764ab9bd5669366c49a8ea",
+            "value": 147
+          }
+        },
+        "297ed97f4dbb43aba188a337e1d8bf26": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_2d9b52f788d34d4d94b6e1b17f0a6ace",
+            "placeholder": "​",
+            "style": "IPY_MODEL_5c662a378c2a4529a0c742fb0a0210fe",
+            "value": " 147/147 [00:00&lt;00:00, 8.15kB/s]"
+          }
+        },
+        "0e2a8d04b5a143bc9bfa77f7b538954e": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "f99a48ae06114986aa4249d0876962a8": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "025cd7df6ab64885bb7939f28a28be28": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "71401278061d4f55af185007f7e21289": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "36e7f52522764ab9bd5669366c49a8ea": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "ProgressStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "ProgressStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "bar_color": null,
+            "description_width": ""
+          }
+        },
+        "2d9b52f788d34d4d94b6e1b17f0a6ace": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "5c662a378c2a4529a0c742fb0a0210fe": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "db160865302c401facf54afd53f972d4": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HBoxModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HBoxModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HBoxView",
+            "box_style": "",
+            "children": [
+              "IPY_MODEL_60d5faa861404744bd685257af4be791",
+              "IPY_MODEL_de92bc34f9d04ad18e0bbfb8cfdc4e3b",
+              "IPY_MODEL_077954e270e444caa52c3f6bbcf20817"
+            ],
+            "layout": "IPY_MODEL_8ecb7100390e4d0cb3425b173dcf07f1"
+          }
+        },
+        "60d5faa861404744bd685257af4be791": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_12a0cb44ed93433ebbc9f6c80c4e494c",
+            "placeholder": "​",
+            "style": "IPY_MODEL_a7b74c6b00384f3f905e3fb8a8c3f216",
+            "value": "Downloading spiece.model: 100%"
+          }
+        },
+        "de92bc34f9d04ad18e0bbfb8cfdc4e3b": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "FloatProgressModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "FloatProgressModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "ProgressView",
+            "bar_style": "success",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_c92cfb1a34af4d35ab4a1cbb552f1a8b",
+            "max": 791656,
+            "min": 0,
+            "orientation": "horizontal",
+            "style": "IPY_MODEL_dea195394d484e6f96e097ac45eb926a",
+            "value": 791656
+          }
+        },
+        "077954e270e444caa52c3f6bbcf20817": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_7a381e3cefcf48a1bc188c49985f2bf2",
+            "placeholder": "​",
+            "style": "IPY_MODEL_40e7ef29f37e414cb27fc6798ec3dd69",
+            "value": " 792k/792k [00:00&lt;00:00, 41.4MB/s]"
+          }
+        },
+        "8ecb7100390e4d0cb3425b173dcf07f1": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "12a0cb44ed93433ebbc9f6c80c4e494c": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "a7b74c6b00384f3f905e3fb8a8c3f216": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "c92cfb1a34af4d35ab4a1cbb552f1a8b": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "dea195394d484e6f96e097ac45eb926a": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "ProgressStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "ProgressStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "bar_color": null,
+            "description_width": ""
+          }
+        },
+        "7a381e3cefcf48a1bc188c49985f2bf2": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "40e7ef29f37e414cb27fc6798ec3dd69": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "c1aadb4cc2754a56af360a1534dc9c07": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HBoxModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HBoxModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HBoxView",
+            "box_style": "",
+            "children": [
+              "IPY_MODEL_b9874db4b125444c9c1e461ec982ab1d",
+              "IPY_MODEL_e78eef1b3587420cb805c2899d3d50f5",
+              "IPY_MODEL_a51b630132cc4f8dacd2b59666ca0640"
+            ],
+            "layout": "IPY_MODEL_e5c5b8a19d5b46269c203e313ac56ca2"
+          }
+        },
+        "b9874db4b125444c9c1e461ec982ab1d": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_3b92d3a3cb0e4c19bf9fead4422a5554",
+            "placeholder": "​",
+            "style": "IPY_MODEL_34ad0d26150f4b589149d4ece6bf877a",
+            "value": "Downloading (…)/main/tokenizer.json: 100%"
+          }
+        },
+        "e78eef1b3587420cb805c2899d3d50f5": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "FloatProgressModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "FloatProgressModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "ProgressView",
+            "bar_style": "success",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_d6a31ea687064ceebf54819168df219c",
+            "max": 1389353,
+            "min": 0,
+            "orientation": "horizontal",
+            "style": "IPY_MODEL_f88fc2141c524793b3266f2ead90b42b",
+            "value": 1389353
+          }
+        },
+        "a51b630132cc4f8dacd2b59666ca0640": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "HTMLModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_dom_classes": [],
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "HTMLModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/controls",
+            "_view_module_version": "1.5.0",
+            "_view_name": "HTMLView",
+            "description": "",
+            "description_tooltip": null,
+            "layout": "IPY_MODEL_d868549bbf9241e8b20a922ad5426b3e",
+            "placeholder": "​",
+            "style": "IPY_MODEL_4fa6dc96da8f49929249cc061a052167",
+            "value": " 1.39M/1.39M [00:00&lt;00:00, 15.0MB/s]"
+          }
+        },
+        "e5c5b8a19d5b46269c203e313ac56ca2": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "3b92d3a3cb0e4c19bf9fead4422a5554": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "34ad0d26150f4b589149d4ece6bf877a": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        },
+        "d6a31ea687064ceebf54819168df219c": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "f88fc2141c524793b3266f2ead90b42b": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "ProgressStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "ProgressStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "bar_color": null,
+            "description_width": ""
+          }
+        },
+        "d868549bbf9241e8b20a922ad5426b3e": {
+          "model_module": "@jupyter-widgets/base",
+          "model_name": "LayoutModel",
+          "model_module_version": "1.2.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/base",
+            "_model_module_version": "1.2.0",
+            "_model_name": "LayoutModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "LayoutView",
+            "align_content": null,
+            "align_items": null,
+            "align_self": null,
+            "border": null,
+            "bottom": null,
+            "display": null,
+            "flex": null,
+            "flex_flow": null,
+            "grid_area": null,
+            "grid_auto_columns": null,
+            "grid_auto_flow": null,
+            "grid_auto_rows": null,
+            "grid_column": null,
+            "grid_gap": null,
+            "grid_row": null,
+            "grid_template_areas": null,
+            "grid_template_columns": null,
+            "grid_template_rows": null,
+            "height": null,
+            "justify_content": null,
+            "justify_items": null,
+            "left": null,
+            "margin": null,
+            "max_height": null,
+            "max_width": null,
+            "min_height": null,
+            "min_width": null,
+            "object_fit": null,
+            "object_position": null,
+            "order": null,
+            "overflow": null,
+            "overflow_x": null,
+            "overflow_y": null,
+            "padding": null,
+            "right": null,
+            "top": null,
+            "visibility": null,
+            "width": null
+          }
+        },
+        "4fa6dc96da8f49929249cc061a052167": {
+          "model_module": "@jupyter-widgets/controls",
+          "model_name": "DescriptionStyleModel",
+          "model_module_version": "1.5.0",
+          "state": {
+            "_model_module": "@jupyter-widgets/controls",
+            "_model_module_version": "1.5.0",
+            "_model_name": "DescriptionStyleModel",
+            "_view_count": null,
+            "_view_module": "@jupyter-widgets/base",
+            "_view_module_version": "1.2.0",
+            "_view_name": "StyleView",
+            "description_width": ""
+          }
+        }
+      }
+    }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 0
+}
\ No newline at end of file