diff --git a/downstream_tasks/JAX_GPU_+_LongT5_XL_Q_4_bit_+_LoRA.ipynb b/downstream_tasks/JAX_GPU_+_LongT5_XL_Q_4_bit_+_LoRA.ipynb new file mode 100644 index 0000000..089a51a --- /dev/null +++ b/downstream_tasks/JAX_GPU_+_LongT5_XL_Q_4_bit_+_LoRA.ipynb @@ -0,0 +1,4426 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "PiSi90gspEQP" + }, + "source": [ + "# Easy GPT-Q + LoRA in JAX ([github](https://github.com/davisyoshida/easy-lora-and-gptq))\n", + "\n", + "[Davis Yoshida](https://github.com/davisyoshida/)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "hfxALa1so2JD" + }, + "source": [ + "This notebook shows how to combine two JAX tools/transforms I wrote: [Lorax](https://github.com/davisyoshida/lorax) and [JAX-GPTQ](https://github.com/davisyoshida/jax-gptq). I've been using the combination to run LLaMA finetunes on a single GPU.\n", + "\n", + "They're both applicable to basically any JAX function, which conveniently includes many HuggingFace models!\n", + "\n", + "The procedure is as follows:\n", + "\n", + "1. Quantize the weights of the model we want to use.\n", + "2. Use Lorax to transform the original model function `F(params, inputs)` into one that takes a tuple of the original params and the low-rank LoRA params: `F_lora(param_tuple, inputs)`.\n", + "3. Wrap `F_lora` in the `use_quantized` transform so that it knows how to handle arguments which are int8 matrices with two parameters per byte.\n", + "4. Train the model, updating only the low-rank params and leaving the larger 4-bit model weights frozen.\n", + "\n", + "I'd love feedback on one or both of these tools, so please let me know on their GitHub repositories if you have any suggestions. JAX-GPTQ in particular is still in a really early state." 
+ ] + }, + { + "cell_type": "markdown", + "source": [ + "#### XLA Runtime OOM Prevention" + ], + "metadata": { + "id": "SYw-sN1-eX3n" + } + }, + { + "cell_type": "code", + "source": [ + "import os\n", + "\n", + "# Allocate 90% of the GPU memory to the XLA runtime\n", + "os.environ[\"XLA_PYTHON_CLIENT_MEM_FRACTION\"]=\".9\"\n", + "\n", + "# Disable preallocation of memory\n", + "os.environ[\"XLA_PYTHON_CLIENT_PREALLOCATE\"]=\"false\"\n", + "\n", + "# Use the platform allocator instead of the cuda allocator\n", + "os.environ[\"XLA_PYTHON_CLIENT_ALLOCATOR\"]=\"platform\"" + ], + "metadata": { + "id": "3DPHwXufeYGC" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "0Y6JeyF45yd_" + }, + "source": [ + "### Setup" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true, + "id": "ljjNpQvkrhsA", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "c132d295-da99-47cb-f7c9-7fb130ec6d9b" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", + "Collecting git+https://github.com/davisyoshida/jax-gptq.git\n", + " Cloning https://github.com/davisyoshida/jax-gptq.git to /tmp/pip-req-build-k_jo2l0c\n", + " Running command git clone --filter=blob:none --quiet https://github.com/davisyoshida/jax-gptq.git /tmp/pip-req-build-k_jo2l0c\n", + " Resolved https://github.com/davisyoshida/jax-gptq.git to commit 8b8ff0fd23b4a7732f1c5dca98d7275045194d3c\n", + " Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n", + "Building wheels for collected packages: jax-gptq\n", + " Building wheel for jax-gptq (setup.py) ... 
\u001b[?25l\u001b[?25hdone\n", + " Created wheel for jax-gptq: filename=jax_gptq-0.0.1-py3-none-any.whl size=16385 sha256=a2859bad302537b7f25b2bee3f4c1b5bbbb271b30821e6db4b595b038197e9e4\n", + " Stored in directory: /tmp/pip-ephem-wheel-cache-ck8sz67p/wheels/ff/5e/fb/dec939c953c916b7437c0ce0839617a79dc06e0a2fd85138a2\n", + "Successfully built jax-gptq\n", + "Installing collected packages: jax-gptq\n", + "Successfully installed jax-gptq-0.0.1\n", + "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", + "Collecting jax-lorax\n", + " Downloading jax_lorax-0.1.2-py3-none-any.whl (8.4 kB)\n", + "Requirement already satisfied: jax<0.5.0,>=0.4.6 in /usr/local/lib/python3.10/dist-packages (from jax-lorax) (0.4.10)\n", + "Requirement already satisfied: jaxlib<0.5.0,>=0.4.6 in /usr/local/lib/python3.10/dist-packages (from jax-lorax) (0.4.10+cuda11.cudnn86)\n", + "Requirement already satisfied: ml-dtypes>=0.1.0 in /usr/local/lib/python3.10/dist-packages (from jax<0.5.0,>=0.4.6->jax-lorax) (0.1.0)\n", + "Requirement already satisfied: numpy>=1.21 in /usr/local/lib/python3.10/dist-packages (from jax<0.5.0,>=0.4.6->jax-lorax) (1.22.4)\n", + "Requirement already satisfied: opt-einsum in /usr/local/lib/python3.10/dist-packages (from jax<0.5.0,>=0.4.6->jax-lorax) (3.3.0)\n", + "Requirement already satisfied: scipy>=1.7 in /usr/local/lib/python3.10/dist-packages (from jax<0.5.0,>=0.4.6->jax-lorax) (1.10.1)\n", + "Installing collected packages: jax-lorax\n", + "Successfully installed jax-lorax-0.1.2\n", + "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", + "Collecting transformers\n", + " Downloading transformers-4.29.2-py3-none-any.whl (7.1 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m7.1/7.1 MB\u001b[0m \u001b[31m37.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hRequirement already satisfied: filelock in 
/usr/local/lib/python3.10/dist-packages (from transformers) (3.12.0)\n", + "Collecting huggingface-hub<1.0,>=0.14.1 (from transformers)\n", + " Downloading huggingface_hub-0.15.1-py3-none-any.whl (236 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m236.8/236.8 kB\u001b[0m \u001b[31m28.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hRequirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from transformers) (1.22.4)\n", + "Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from transformers) (23.1)\n", + "Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (6.0)\n", + "Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers) (2022.10.31)\n", + "Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers) (2.27.1)\n", + "Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers)\n", + " Downloading tokenizers-0.13.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m7.8/7.8 MB\u001b[0m \u001b[31m93.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hRequirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/dist-packages (from transformers) (4.65.0)\n", + "Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0,>=0.14.1->transformers) (2023.4.0)\n", + "Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0,>=0.14.1->transformers) (4.5.0)\n", + "Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (1.26.15)\n", + "Requirement already satisfied: certifi>=2017.4.17 in 
/usr/local/lib/python3.10/dist-packages (from requests->transformers) (2022.12.7)\n", + "Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (2.0.12)\n", + "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (3.4)\n", + "Installing collected packages: tokenizers, huggingface-hub, transformers\n", + "Successfully installed huggingface-hub-0.15.1 tokenizers-0.13.3 transformers-4.29.2\n", + "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", + "Collecting accelerate\n", + " Downloading accelerate-0.19.0-py3-none-any.whl (219 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m219.1/219.1 kB\u001b[0m \u001b[31m6.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hRequirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from accelerate) (1.22.4)\n", + "Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from accelerate) (23.1)\n", + "Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from accelerate) (5.9.5)\n", + "Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from accelerate) (6.0)\n", + "Requirement already satisfied: torch>=1.6.0 in /usr/local/lib/python3.10/dist-packages (from accelerate) (2.0.1+cu118)\n", + "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch>=1.6.0->accelerate) (3.12.0)\n", + "Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from torch>=1.6.0->accelerate) (4.5.0)\n", + "Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch>=1.6.0->accelerate) (1.11.1)\n", + "Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from 
torch>=1.6.0->accelerate) (3.1)\n", + "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=1.6.0->accelerate) (3.1.2)\n", + "Requirement already satisfied: triton==2.0.0 in /usr/local/lib/python3.10/dist-packages (from torch>=1.6.0->accelerate) (2.0.0)\n", + "Requirement already satisfied: cmake in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch>=1.6.0->accelerate) (3.25.2)\n", + "Requirement already satisfied: lit in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch>=1.6.0->accelerate) (16.0.5)\n", + "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch>=1.6.0->accelerate) (2.1.2)\n", + "Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch>=1.6.0->accelerate) (1.3.0)\n", + "Installing collected packages: accelerate\n", + "Successfully installed accelerate-0.19.0\n" + ] + } + ], + "source": [ + "!pip install git+https://github.com/davisyoshida/jax-gptq.git\n", + "!pip install jax-lorax\n", + "!pip install transformers\n", + "!pip install accelerate" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "75-T_R0Ms9qD" + }, + "outputs": [], + "source": [ + "from functools import partial\n", + "import jax\n", + "import jax.numpy as jnp\n", + "import numpy as np\n", + "import optax\n", + "import torch\n", + "\n", + "import transformers\n", + "from transformers import (\n", + " CONFIG_MAPPING,\n", + " FLAX_MODEL_FOR_CAUSAL_LM_MAPPING,\n", + " AutoConfig,\n", + " AutoTokenizer,\n", + " FlaxAutoModelForCausalLM,\n", + " HfArgumentParser,\n", + " TrainingArguments,\n", + " is_tensorboard_available,\n", + ")\n", + "\n", + "from tqdm import trange\n", + "\n", + "import lorax\n", + "import jax_gptq\n", + "\n", + "gpu = jax.devices('gpu')[0]\n", + "cpu = jax.devices('cpu')[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "GQuDSjz7svdL" + }, 
+ "source": [ + "## Toy Example\n", + "\n", + "### Model/Data setup\n", + "\n", + "First we'll load the LongT5-XL model in 4-bit (NF4) precision, along with its config and tokenizer:" + ] + }, + { + "cell_type": "code", + "source": [ + "from transformers import LongT5Config, FlaxT5ForConditionalGeneration\n", + "from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer\n", + "\n", + "from transformers import BitsAndBytesConfig\n", + "\n", + "\n", + "nf4_config = BitsAndBytesConfig(\n", + " load_in_4bit=True,\n", + " bnb_4bit_quant_type=\"nf4\",\n", + " bnb_4bit_use_double_quant=True,\n", + " bnb_4bit_compute_dtype=torch.bfloat16\n", + ")\n", + "\n", + "# Load the LongT5-XL model with its configuration\n", + "model_id = \"google/long-t5-tglobal-xl\"\n", + "config = LongT5Config.from_pretrained(model_id)\n", + "#model = AutoModelForSeq2SeqLM.from_pretrained(model_id, load_in_4bit=True, device_map=\"auto\")\n", + "model = AutoModelForSeq2SeqLM.from_pretrained(model_id, quantization_config=nf4_config)\n", + "tokenizer = AutoTokenizer.from_pretrained(model_id)" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 305, + "referenced_widgets": [ + "1feaed28d930404d8684eb14f7f363c7", + "3b045a0e3ecf4c098d8fa8b0917108a5", + "ab35fcc175334e089b9abf95d91604c3", + "bc4dcd32917f4970843bdde97d849285", + "ba84ab17fb18462a82cd6ddd9581d8de", + "479284c4dab249d5b88b9f471e988392", + "98ba61d9a09b42629f5985a4b82041c5", + "0438654193164051b8eaf4e234299bf2", + "dd336b937eff4790a8b7ed0b5f8d211f", + "d86b4cb7e789448a93d78e597a9c45eb", + "3fbe401e5b084e66b567e6a597183f01", + "bd19569f52df4784ac7dbdfd159f5588", + "1314b9257b2047769a7c95408ddaddc3", + "196c4ac94eab47048c7de169ce8e12f1", + "9c40c0629bac4d3884a991153bb61a15", + "44411fd1e3254d97b05e3b536c5b4f3b", + "272afa7b903244e297c4904b19c01632", + "f6ef494723ad4b078cda848ec84bf621", + "646e4a2d4ed34cd2ab73cb598a011075", + "b4a84db21aa44c77bdbddbaabc3fc079", + "3313b927bd0a4546baf4ce6f6ebd5bcf", + 
"9b2914d7649a447a9e059d7a65600e44", + "1169578ce65440f695f8ec234c84a90d", + "a835304240c54c5db280426c5fb856c4", + "cd42bbbaf3eb4c4fb1ade3c314b2eaac", + "ebdcacb5d2094c878af40538fb2ec42c", + "3ead271f91ea4af3a64928754048020f", + "93e3146c97c4430a84524f9fc804a219", + "0708839b96b1495a9e2f15f0d1b78574", + "fd56b1d95eb04339b9b01ba691db4e81", + "701570dcfc2a4b8d99b2ea9c08057fe7", + "521d7df8efe849c0a43b7b2f2b10ba24", + "0da6f8caf9b34844bf763f7030c5e504", + "f99effa921ce42e2b62ca92d18d6d8d0", + "33889cada1fd42a3933eb4f1e923d7a4", + "0f30b08741a74eab83dd3689671f1269", + "fd73f715ee7548979c57df22efda5097", + "e44cf747982d41bb862843133b58f4b1", + "8c12d66a9488411986f8a0d16822ba87", + "df7a49e2d8b34643baf60560e1aa89fc", + "7974a3139aca4dabbc2bc57844121070", + "865b5e4cd3a2499da68eaaf5d47dfc95", + "6a96f4ad805142468ae84bf77ed41d3a", + "6a751bdf2f094739b4c3cba478625f84", + "c7f3959181c046c7b281c5af7d8c959c", + "a90ac03129fd40a2b5c9ae8d59dcdaef", + "20b6fae45ebd4badbf402c81e3d24d47", + "10973afff9364526b481f09d89e1d7e2", + "9d734a6342524c6dab9699468609a1d4", + "630542ff534b410dadf99e20bb6194c5", + "a2d9be02f65a47cfbec746acd4a3360c", + "c6948d90a1484452bc693646ad619666", + "01641210f615480f8c239feb99f0e323", + "0012771409a04059bb6eb85a6ad5eff2", + "430f70d9467041f8bea72b4a16e430de", + "88a9e1e6abdc439e9808df4235585137", + "6874e404870a407ab1672e15612712d8", + "80c4510fff3e44c58b52e314d4ce1174", + "45771a1396094a2e9f8bb6dc6800b908", + "b3b373bdfc26499dbf400667b2edc6b4", + "8c93c66cdf0a44d4ab9bcc4df497d1ec", + "cd379c6435394897a3906d4bc6b71345", + "5d198996c0344b8f92409b1cbff19af4", + "d965c2e080bd452a99954a3b3af2c147", + "b79886293822421895753aa0e2d2d170", + "f27d00a5f5964ee684787d97f132834a", + "2f214f129655487baff53148920c7c95", + "ef32e3942ca14835aaa1fcd1bfc80018", + "43d505d0a0294e4abd4878f53d7f7bc8", + "297ed97f4dbb43aba188a337e1d8bf26", + "0e2a8d04b5a143bc9bfa77f7b538954e", + "f99a48ae06114986aa4249d0876962a8", + "025cd7df6ab64885bb7939f28a28be28", + 
"71401278061d4f55af185007f7e21289", + "36e7f52522764ab9bd5669366c49a8ea", + "2d9b52f788d34d4d94b6e1b17f0a6ace", + "5c662a378c2a4529a0c742fb0a0210fe", + "db160865302c401facf54afd53f972d4", + "60d5faa861404744bd685257af4be791", + "de92bc34f9d04ad18e0bbfb8cfdc4e3b", + "077954e270e444caa52c3f6bbcf20817", + "8ecb7100390e4d0cb3425b173dcf07f1", + "12a0cb44ed93433ebbc9f6c80c4e494c", + "a7b74c6b00384f3f905e3fb8a8c3f216", + "c92cfb1a34af4d35ab4a1cbb552f1a8b", + "dea195394d484e6f96e097ac45eb926a", + "7a381e3cefcf48a1bc188c49985f2bf2", + "40e7ef29f37e414cb27fc6798ec3dd69", + "c1aadb4cc2754a56af360a1534dc9c07", + "b9874db4b125444c9c1e461ec982ab1d", + "e78eef1b3587420cb805c2899d3d50f5", + "a51b630132cc4f8dacd2b59666ca0640", + "e5c5b8a19d5b46269c203e313ac56ca2", + "3b92d3a3cb0e4c19bf9fead4422a5554", + "34ad0d26150f4b589149d4ece6bf877a", + "d6a31ea687064ceebf54819168df219c", + "f88fc2141c524793b3266f2ead90b42b", + "d868549bbf9241e8b20a922ad5426b3e", + "4fa6dc96da8f49929249cc061a052167" + ] + }, + "id": "YKcA0xmzRIas", + "outputId": "61ec2b74-04ab-4eb4-962d-8862a0441d1a" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "display_data", + "data": { + "text/plain": [ + "Downloading (…)lve/main/config.json: 0%| | 0.00/896 [00:00\u001b[0m:\u001b[94m46\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2;33m/usr/local/lib/python3.10/dist-packages/optax/_src/\u001b[0m\u001b[1;33mcombine.py\u001b[0m:\u001b[94m50\u001b[0m in \u001b[92minit_fn\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 47 \u001b[0m\u001b[2m \u001b[0minit_fns, update_fns = \u001b[96mzip\u001b[0m(*transforms) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 48 \u001b[0m\u001b[2m \u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 49 \u001b[0m\u001b[2m \u001b[0m\u001b[94mdef\u001b[0m \u001b[92minit_fn\u001b[0m(params): \u001b[31m│\u001b[0m\n", + 
"\u001b[31m│\u001b[0m \u001b[31m❱ \u001b[0m 50 \u001b[2m│ \u001b[0m\u001b[94mreturn\u001b[0m \u001b[96mtuple\u001b[0m(fn(params) \u001b[94mfor\u001b[0m fn \u001b[95min\u001b[0m init_fns) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 51 \u001b[0m\u001b[2m \u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 52 \u001b[0m\u001b[2m \u001b[0m\u001b[94mdef\u001b[0m \u001b[92mupdate_fn\u001b[0m(updates, state, params=\u001b[94mNone\u001b[0m, **extra_args): \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 53 \u001b[0m\u001b[2m│ \u001b[0m\u001b[94mif\u001b[0m \u001b[96mlen\u001b[0m(update_fns) != \u001b[96mlen\u001b[0m(state): \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2;33m/usr/local/lib/python3.10/dist-packages/optax/_src/\u001b[0m\u001b[1;33mcombine.py\u001b[0m:\u001b[94m50\u001b[0m in \u001b[92m\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 47 \u001b[0m\u001b[2m \u001b[0minit_fns, update_fns = \u001b[96mzip\u001b[0m(*transforms) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 48 \u001b[0m\u001b[2m \u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 49 \u001b[0m\u001b[2m \u001b[0m\u001b[94mdef\u001b[0m \u001b[92minit_fn\u001b[0m(params): \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m❱ \u001b[0m 50 \u001b[2m│ \u001b[0m\u001b[94mreturn\u001b[0m \u001b[96mtuple\u001b[0m(fn(params) \u001b[94mfor\u001b[0m fn \u001b[95min\u001b[0m init_fns) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 51 \u001b[0m\u001b[2m \u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 52 \u001b[0m\u001b[2m \u001b[0m\u001b[94mdef\u001b[0m \u001b[92mupdate_fn\u001b[0m(updates, state, params=\u001b[94mNone\u001b[0m, **extra_args): \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 53 \u001b[0m\u001b[2m│ \u001b[0m\u001b[94mif\u001b[0m 
\u001b[96mlen\u001b[0m(update_fns) != \u001b[96mlen\u001b[0m(state): \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2;33m/usr/local/lib/python3.10/dist-packages/optax/_src/\u001b[0m\u001b[1;33mtransform.py\u001b[0m:\u001b[94m335\u001b[0m in \u001b[92minit_fn\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 332 \u001b[0m\u001b[2m \u001b[0mmu_dtype = utils.canonicalize_dtype(mu_dtype) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 333 \u001b[0m\u001b[2m \u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 334 \u001b[0m\u001b[2m \u001b[0m\u001b[94mdef\u001b[0m \u001b[92minit_fn\u001b[0m(params): \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m❱ \u001b[0m 335 \u001b[2m│ \u001b[0mmu = jax.tree_util.tree_map( \u001b[2m# First moment\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 336 \u001b[0m\u001b[2m│ │ \u001b[0m\u001b[94mlambda\u001b[0m t: jnp.zeros_like(t, dtype=mu_dtype), params) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 337 \u001b[0m\u001b[2m│ \u001b[0mnu = jax.tree_util.tree_map(jnp.zeros_like, params) \u001b[2m# Second moment\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 338 \u001b[0m\u001b[2m│ \u001b[0m\u001b[94mreturn\u001b[0m ScaleByAdamState(count=jnp.zeros([], jnp.int32), mu=mu, nu=nu) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2;33m/usr/local/lib/python3.10/dist-packages/jax/_src/\u001b[0m\u001b[1;33mtree_util.py\u001b[0m:\u001b[94m210\u001b[0m in \u001b[92mtree_map\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m207 \u001b[0m\u001b[2;33m \u001b[0m\u001b[33m\"\"\"\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m208 \u001b[0m\u001b[2m \u001b[0mleaves, treedef = tree_flatten(tree, is_leaf) 
\u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m209 \u001b[0m\u001b[2m \u001b[0mall_leaves = [leaves] + [treedef.flatten_up_to(r) \u001b[94mfor\u001b[0m r \u001b[95min\u001b[0m rest] \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m❱ \u001b[0m210 \u001b[2m \u001b[0m\u001b[94mreturn\u001b[0m treedef.unflatten(f(*xs) \u001b[94mfor\u001b[0m xs \u001b[95min\u001b[0m \u001b[96mzip\u001b[0m(*all_leaves)) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m211 \u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m212 \u001b[0m\u001b[94mdef\u001b[0m \u001b[92mbuild_tree\u001b[0m(treedef: PyTreeDef, xs: Any) -> Any: \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m213 \u001b[0m\u001b[2m \u001b[0m\u001b[94mreturn\u001b[0m treedef.from_iterable_tree(xs) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2;33m/usr/local/lib/python3.10/dist-packages/jax/_src/\u001b[0m\u001b[1;33mtree_util.py\u001b[0m:\u001b[94m210\u001b[0m in \u001b[92m\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m207 \u001b[0m\u001b[2;33m \u001b[0m\u001b[33m\"\"\"\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m208 \u001b[0m\u001b[2m \u001b[0mleaves, treedef = tree_flatten(tree, is_leaf) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m209 \u001b[0m\u001b[2m \u001b[0mall_leaves = [leaves] + [treedef.flatten_up_to(r) \u001b[94mfor\u001b[0m r \u001b[95min\u001b[0m rest] \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m❱ \u001b[0m210 \u001b[2m \u001b[0m\u001b[94mreturn\u001b[0m treedef.unflatten(f(*xs) \u001b[94mfor\u001b[0m xs \u001b[95min\u001b[0m \u001b[96mzip\u001b[0m(*all_leaves)) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m211 \u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m212 \u001b[0m\u001b[94mdef\u001b[0m \u001b[92mbuild_tree\u001b[0m(treedef: PyTreeDef, xs: 
Any) -> Any: \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m213 \u001b[0m\u001b[2m \u001b[0m\u001b[94mreturn\u001b[0m treedef.from_iterable_tree(xs) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2;33m/usr/local/lib/python3.10/dist-packages/optax/_src/\u001b[0m\u001b[1;33mtransform.py\u001b[0m:\u001b[94m336\u001b[0m in \u001b[92m\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 333 \u001b[0m\u001b[2m \u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 334 \u001b[0m\u001b[2m \u001b[0m\u001b[94mdef\u001b[0m \u001b[92minit_fn\u001b[0m(params): \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 335 \u001b[0m\u001b[2m│ \u001b[0mmu = jax.tree_util.tree_map( \u001b[2m# First moment\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m❱ \u001b[0m 336 \u001b[2m│ │ \u001b[0m\u001b[94mlambda\u001b[0m t: jnp.zeros_like(t, dtype=mu_dtype), params) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 337 \u001b[0m\u001b[2m│ \u001b[0mnu = jax.tree_util.tree_map(jnp.zeros_like, params) \u001b[2m# Second moment\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 338 \u001b[0m\u001b[2m│ \u001b[0m\u001b[94mreturn\u001b[0m ScaleByAdamState(count=jnp.zeros([], jnp.int32), mu=mu, nu=nu) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m 339 \u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2;33m/usr/local/lib/python3.10/dist-packages/jax/_src/numpy/\u001b[0m\u001b[1;33mlax_numpy.py\u001b[0m:\u001b[94m2054\u001b[0m in \u001b[92mzeros_like\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m2051 \u001b[0m\u001b[1;95m@util\u001b[0m._wraps(np.zeros_like) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m2052 \u001b[0m\u001b[94mdef\u001b[0m 
\u001b[92mzeros_like\u001b[0m(a: ArrayLike, dtype: Optional[DTypeLike] = \u001b[94mNone\u001b[0m, \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m2053 \u001b[0m\u001b[2m│ │ │ \u001b[0mshape: Any = \u001b[94mNone\u001b[0m) -> Array: \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m❱ \u001b[0m2054 \u001b[2m \u001b[0mutil.check_arraylike(\u001b[33m\"\u001b[0m\u001b[33mzeros_like\u001b[0m\u001b[33m\"\u001b[0m, a) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m2055 \u001b[0m\u001b[2m \u001b[0mdtypes.check_user_dtype_supported(dtype, \u001b[33m\"\u001b[0m\u001b[33mzeros_like\u001b[0m\u001b[33m\"\u001b[0m) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m2056 \u001b[0m\u001b[2m \u001b[0m\u001b[94mif\u001b[0m shape \u001b[95mis\u001b[0m \u001b[95mnot\u001b[0m \u001b[94mNone\u001b[0m: \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m2057 \u001b[0m\u001b[2m│ \u001b[0mshape = canonicalize_shape(shape) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2;33m/usr/local/lib/python3.10/dist-packages/jax/_src/numpy/\u001b[0m\u001b[1;33mutil.py\u001b[0m:\u001b[94m328\u001b[0m in \u001b[92mcheck_arraylike\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m325 \u001b[0m\u001b[2m│ \u001b[0mpos, arg = \u001b[96mnext\u001b[0m((i, arg) \u001b[94mfor\u001b[0m i, arg \u001b[95min\u001b[0m \u001b[96menumerate\u001b[0m(args) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m326 \u001b[0m\u001b[2m│ │ │ │ │ \u001b[0m\u001b[94mif\u001b[0m \u001b[95mnot\u001b[0m _arraylike(arg)) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m327 \u001b[0m\u001b[2m│ \u001b[0mmsg = \u001b[33m\"\u001b[0m\u001b[33m{}\u001b[0m\u001b[33m requires ndarray or scalar arguments, got \u001b[0m\u001b[33m{}\u001b[0m\u001b[33m at position \u001b[0m\u001b[33m{}\u001b[0m\u001b[33m.\u001b[0m\u001b[33m\"\u001b[0m \u001b[31m│\u001b[0m\n", + 
"\u001b[31m│\u001b[0m \u001b[31m❱ \u001b[0m328 \u001b[2m│ \u001b[0m\u001b[94mraise\u001b[0m \u001b[96mTypeError\u001b[0m(msg.format(fun_name, \u001b[96mtype\u001b[0m(arg), pos)) \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m329 \u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m330 \u001b[0m \u001b[31m│\u001b[0m\n", + "\u001b[31m│\u001b[0m \u001b[2m331 \u001b[0m\u001b[94mdef\u001b[0m \u001b[92mcheck_arraylike_or_none\u001b[0m(fun_name: \u001b[96mstr\u001b[0m, *args: Any): \u001b[31m│\u001b[0m\n", + "\u001b[31m╰──────────────────────────────────────────────────────────────────────────────────────────────────╯\u001b[0m\n", + "\u001b[1;91mTypeError: \u001b[0mzeros_like requires ndarray or scalar arguments, got \u001b[1m<\u001b[0m\u001b[1;95mclass\u001b[0m\u001b[39m \u001b[0m\u001b[32m'generator'\u001b[0m\u001b[1m>\u001b[0m at position \u001b[1;36m0\u001b[0m.\n" + ], + "text/html": [ + "
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮\n",
+              " in <cell line: 46>:46                                                                            \n",
+              "                                                                                                  \n",
+              " /usr/local/lib/python3.10/dist-packages/optax/_src/combine.py:50 in init_fn                      \n",
+              "                                                                                                  \n",
+              "    47   init_fns, update_fns = zip(*transforms)                                                  \n",
+              "    48                                                                                            \n",
+              "    49   def init_fn(params):                                                                     \n",
+              "  50 return tuple(fn(params) for fn in init_fns)                                            \n",
+              "    51                                                                                            \n",
+              "    52   def update_fn(updates, state, params=None, **extra_args):                                \n",
+              "    53 │   if len(update_fns) != len(state):                                                      \n",
+              "                                                                                                  \n",
+              " /usr/local/lib/python3.10/dist-packages/optax/_src/combine.py:50 in <genexpr>                    \n",
+              "                                                                                                  \n",
+              "    47   init_fns, update_fns = zip(*transforms)                                                  \n",
+              "    48                                                                                            \n",
+              "    49   def init_fn(params):                                                                     \n",
+              "  50 return tuple(fn(params) for fn in init_fns)                                            \n",
+              "    51                                                                                            \n",
+              "    52   def update_fn(updates, state, params=None, **extra_args):                                \n",
+              "    53 │   if len(update_fns) != len(state):                                                      \n",
+              "                                                                                                  \n",
+              " /usr/local/lib/python3.10/dist-packages/optax/_src/transform.py:335 in init_fn                   \n",
+              "                                                                                                  \n",
+              "    332   mu_dtype = utils.canonicalize_dtype(mu_dtype)                                           \n",
+              "    333                                                                                           \n",
+              "    334   def init_fn(params):                                                                    \n",
+              "  335 mu = jax.tree_util.tree_map(  # First moment                                          \n",
+              "    336 │   │   lambda t: jnp.zeros_like(t, dtype=mu_dtype), params)                              \n",
+              "    337 │   nu = jax.tree_util.tree_map(jnp.zeros_like, params)  # Second moment                  \n",
+              "    338 │   return ScaleByAdamState(count=jnp.zeros([], jnp.int32), mu=mu, nu=nu)                 \n",
+              "                                                                                                  \n",
+              " /usr/local/lib/python3.10/dist-packages/jax/_src/tree_util.py:210 in tree_map                    \n",
+              "                                                                                                  \n",
+              "   207   \"\"\"                                                                                      \n",
+              "   208   leaves, treedef = tree_flatten(tree, is_leaf)                                            \n",
+              "   209   all_leaves = [leaves] + [treedef.flatten_up_to(r) for r in rest]                         \n",
+              " 210   return treedef.unflatten(f(*xs) for xs in zip(*all_leaves))                              \n",
+              "   211                                                                                            \n",
+              "   212 def build_tree(treedef: PyTreeDef, xs: Any) -> Any:                                        \n",
+              "   213   return treedef.from_iterable_tree(xs)                                                    \n",
+              "                                                                                                  \n",
+              " /usr/local/lib/python3.10/dist-packages/jax/_src/tree_util.py:210 in <genexpr>                   \n",
+              "                                                                                                  \n",
+              "   207   \"\"\"                                                                                      \n",
+              "   208   leaves, treedef = tree_flatten(tree, is_leaf)                                            \n",
+              "   209   all_leaves = [leaves] + [treedef.flatten_up_to(r) for r in rest]                         \n",
+              " 210   return treedef.unflatten(f(*xs) for xs in zip(*all_leaves))                              \n",
+              "   211                                                                                            \n",
+              "   212 def build_tree(treedef: PyTreeDef, xs: Any) -> Any:                                        \n",
+              "   213   return treedef.from_iterable_tree(xs)                                                    \n",
+              "                                                                                                  \n",
+              " /usr/local/lib/python3.10/dist-packages/optax/_src/transform.py:336 in <lambda>                  \n",
+              "                                                                                                  \n",
+              "    333                                                                                           \n",
+              "    334   def init_fn(params):                                                                    \n",
+              "    335 │   mu = jax.tree_util.tree_map(  # First moment                                          \n",
+              "  336 │   │   lambda t: jnp.zeros_like(t, dtype=mu_dtype), params)                              \n",
+              "    337 │   nu = jax.tree_util.tree_map(jnp.zeros_like, params)  # Second moment                  \n",
+              "    338 │   return ScaleByAdamState(count=jnp.zeros([], jnp.int32), mu=mu, nu=nu)                 \n",
+              "    339                                                                                           \n",
+              "                                                                                                  \n",
+              " /usr/local/lib/python3.10/dist-packages/jax/_src/numpy/lax_numpy.py:2054 in zeros_like           \n",
+              "                                                                                                  \n",
+              "   2051 @util._wraps(np.zeros_like)                                                               \n",
+              "   2052 def zeros_like(a: ArrayLike, dtype: Optional[DTypeLike] = None,                           \n",
+              "   2053 │   │   │      shape: Any = None) -> Array:                                               \n",
+              " 2054   util.check_arraylike(\"zeros_like\", a)                                                   \n",
+              "   2055   dtypes.check_user_dtype_supported(dtype, \"zeros_like\")                                  \n",
+              "   2056   if shape is not None:                                                                   \n",
+              "   2057 │   shape = canonicalize_shape(shape)                                                     \n",
+              "                                                                                                  \n",
+              " /usr/local/lib/python3.10/dist-packages/jax/_src/numpy/util.py:328 in check_arraylike            \n",
+              "                                                                                                  \n",
+              "   325 │   pos, arg = next((i, arg) for i, arg in enumerate(args)                                 \n",
+              "   326 │   │   │   │   │   if not _arraylike(arg))                                                \n",
+              "   327 │   msg = \"{} requires ndarray or scalar arguments, got {} at position {}.\"                \n",
+              " 328 raise TypeError(msg.format(fun_name, type(arg), pos))                                  \n",
+              "   329                                                                                            \n",
+              "   330                                                                                            \n",
+              "   331 def check_arraylike_or_none(fun_name: str, *args: Any):                                    \n",
+              "╰──────────────────────────────────────────────────────────────────────────────────────────────────╯\n",
+              "TypeError: zeros_like requires ndarray or scalar arguments, got <class 'generator'> at position 0.\n",
+              "
\n" + ] + }, + "metadata": {} + } + ], + "source": [ + "# Initialize the model parameters using JAX's PRNG key\n", + "rng_key = jax.random.PRNGKey(0)\n", + "input_ids = jnp.array([[1, 2, 3, 4, 5]])\n", + "decoder_input_ids = jnp.array([[1, 2, 3, 4, 5]])\n", + "# NOTE: model.parameters() returns a generator rather than a pytree of arrays, which is why optimizer.init(params) below raises the TypeError recorded in this cell's output\n", + "params = model.parameters() # Returns an iterable over the parameters\n", + "#params = model.named_parameters() # Returns an iterable over the parameters and their names\n", + "\n", + "# Modify my_model to use the LongT5-XL model instead of the custom model defined earlier\n", + "def my_model(params, x):\n", + "    logits = model(input_ids=x, params=params, train=True).logits\n", + "    return jnp.mean(logits)\n", + "\n", + "# Define a loss function for the LongT5-XL model\n", + "@jax.jit\n", + "def compute_loss(params, input_ids, decoder_input_ids, labels):\n", + "    logits = model(\n", + "        input_ids=input_ids,\n", + "        decoder_input_ids=decoder_input_ids,\n", + "        params=params,\n", + "        train=True\n", + "    ).logits\n", + "\n", + "# Transform the loss function to get the gradients\n", + "grad_fn = jax.value_and_grad(compute_loss)\n", + "\n", + "# Define an optimizer to update the parameters using the gradients\n", + "optimizer = optax.adam(learning_rate=1e-3)\n", + "\n", + "# Define a train step function which combines the loss function and optimizer update, does the forward and backward pass, and returns the updated parameters\n", + "@jax.jit\n", + "def train_step(params, x, y, optimizer_state):\n", + "    # jax.value_and_grad returns (value, grads), in that order\n", + "    loss, grads = grad_fn(params, x, y)\n", + "    updates, optimizer_state = optimizer.update(grads, optimizer_state)\n", + "    new_params = optax.apply_updates(params, updates)\n", + "    return new_params, loss, optimizer_state\n", + "\n", + "# Define a batch generator function using get_batches() from stackoverflow.com\n", + "def generate_batch(batch_size, rng, DIM=512):\n", + "    # Generate a batch of input-output pairs\n", + "    X_batch = list(jax.random.normal(rng, (batch_size, DIM)))\n", + "    Y_batch = jax.random.randint(rng, (batch_size,), 0, 2, dtype=jnp.int32)\n", + "\n", + "    return X_batch, Y_batch\n", + "\n", + "# Initialize the optimizer state and the PRNG key\n", + "optimizer_state = optimizer.init(params)\n", + "rng = jax.random.PRNGKey(0)\n", + "\n", + "# Train the model\n", + "num_steps = 50\n", + "batch_size = 4\n", + "\n", + "for i in range(num_steps):\n", + "    # Generate a batch of input-output pairs\n", + "    x_batch, y_batch = generate_batch(batch_size, rng)\n", + "\n", + "    # Update the parameters and optimizer state\n", + "    params, loss, optimizer_state = train_step(params, x_batch, y_batch, optimizer_state)\n", + "\n", + "    # Print the loss every 10 steps\n", + "    if i % 10 == 0:\n", + "        print(f'Step {i}, Loss: {loss}')\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "RlCLAmjBvhnA" + }, + "source": [ + "GPT-Q needs input data for quantization. For an actual model we'd use real data, but here we'll just make some random inputs." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "6govTMOZvgSC" + }, + "outputs": [], + "source": [ + "quant_data = [jax.random.normal(key, (batch_size, DIM)) for key in jax.random.split(data_key, 64)]\n", + "\n", + "# We'll save an output for later comparison since the quantization process will delete the original params\n", + "original_output = my_model(params, quant_data[0])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Rjdb3h46vtsi" + }, + "source": [ + "### Run GPT-Q to get the quantized weights\n", + "That's all for the setup; we can now just run GPT-Q (without any changes to the original model code):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true, + "id": "L1Mw9ZLpvrLa" + }, + "outputs": [], + "source": [ + "# Note that this may free the buffers associated with some or all of the parameters and the data to save VRAM\n", + "# I'd also recommend you put the params on the CPU, since `quantize()` will move the params 
to the GPU when necessary\n", + "quantized_params = jax_gptq.quantize(my_model, params, quant_data)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2NhVv8egwDQu" + }, + "source": [ + "The matrices have been quantized but the biases have been left alone:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "bWwXzTJyubbH" + }, + "outputs": [], + "source": [ + "print(f'W type: {type(quantized_params[0][\"w\"])}')\n", + "print(f'B type: {type(quantized_params[0][\"b\"])}')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "QwYLTr6WwapB" + }, + "source": [ + "**Note**: The quantization procedure depends on the parameter being used in a matrix multiplication. Currently JAX-GPTQ supports general dot operations (including ones using tensors with any number of dimensions larger than 1), and convolutions with kernels of spatial size 1.\n", + "\n", + "### Applying the quantized weights\n", + "We can now run the quantized model without any code changes. All that's necessary is using `jax_gptq.use_quantized` to transform the function so it knows how to handle `QuantizedMatrix` values." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "I6aLdXqawQFs" + }, + "outputs": [], + "source": [ + "quantized_params = jax.device_put(quantized_params, gpu) # Move the params to the GPU\n", + "\n", + "# Originally:\n", + "# my_model(params, inputs)\n", + "# After:\n", + "# jax_gptq.use_quantized(my_model)(quantized_params, inputs)\n", + "quant_output = jax_gptq.use_quantized(my_model)(quantized_params, quant_data[0])\n", + "\n", + "print(f'Output of quantized network: {quant_output:.3e}')\n", + "print(f'Original output: {original_output:.3e}')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "1vXkTTctx7Vo" + }, + "source": [ + "### Train with LoRA\n", + "\n", + "Now that we've compressed our model to 4 bits (and change) per parameter, we can add full-precision LoRA parameters for finetuning.\n", + "\n", + "The one gotcha about combining the two is that Lorax doesn't know that `QuantizedMatrix` values are pytree leaves, so you need to give the Lorax functions an `is_leaf` predicate." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "l95MirHdzNo9" + }, + "source": [ + "**Initialization:** The `init_lora` function expects a pytree describing which parameters should get LoRA parameters, which should be fully trained, and which should be left frozen. `lorax.simple_spec` is a helper function for making these specs." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "HKkhcjx9zJy6" + }, + "outputs": [], + "source": [ + "def is_leaf(x):\n", + " return isinstance(x, jax_gptq.QuantizedMatrix)\n", + "\n", + "lora_spec = lorax.simple_spec(\n", + " params=quantized_params,\n", + " decision_fn=lambda pytree_path, arr: 4, # Just ignore the inputs and specify an inner rank of 4 for all params\n", + " tune_vectors=False, # Tell Lorax to put all the biases in the frozen params tree instead of the tunable params tree\n", + " is_leaf=is_leaf\n", + ")\n", + "\n", + "# Lorax splits the parameters into two pytrees:\n", + "# freeze_params: Anything which received the value lorax.LORA_FREEZE in the spec\n", + "# train_params: Pairs of two narrow matrices for values which got positive integers as spec values, or the full parameter if the value lorax.LORA_FULL was in the spec\n", + "freeze_params, train_params = lorax.init_lora(quantized_params, lora_spec, jax.random.PRNGKey(1234), is_leaf=is_leaf)\n", + "\n", + "def merge_quantized_with_lora(q_params, lora_freeze):\n", + " return jax.tree_map(\n", + " lambda quant, from_lora: quant if isinstance(quant, jax_gptq.QuantizedMatrix) else from_lora,\n", + " q_params,\n", + " lora_freeze,\n", + " is_leaf=lambda x: isinstance(x, jax_gptq.QuantizedMatrix) # Tell tree_map to treat QuantizedMatrix as a single value instead of a non-leaf node\n", + " )\n", + "# Now we put the actual quantized params back\n", + "#freeze_params = merge_quantized_with_lora(quantized_params, freeze_params)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-ebT9GXp16v4" + }, + "source": [ + "The `lorax.lora` transform converts a function from expecting a single pytree in the specified argument to expecting a tuple of two pytrees. It composes with other JAX transforms such as `jax_gptq.use_quantized`, so we can use both at once with no modifications to our model code." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "1XjjuQcq1oSq" + }, + "outputs": [], + "source": [ + "combined_params = (freeze_params, train_params)\n", + "\n", + "my_model_with_lora_and_quantized_weights = jax_gptq.use_quantized(lorax.lora(my_model))\n", + "\n", + "# The differences from the original `my_model` function are:\n", + "# 1. The params argument now expects a tuple of (frozen_params, trainable_params)\n", + "# 2. It knows how to compute with quantized weights\n", + "quantized_plus_lorax_output = my_model_with_lora_and_quantized_weights(combined_params, quant_data[0])\n", + "\n", + "print(f'GPTQ + Lorax output: {quantized_plus_lorax_output:.3e}')\n", + "print(f'GPTQ only: {quant_output:.3e}')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "aIywP5qQ3KEH" + }, + "source": [ + "The above values are identical since LoRA initializes one of each pair of matrices as zeros.\n", + "\n", + "Let's look at the size of each pytree:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "nqQwBPjh2ttl" + }, + "outputs": [], + "source": [ + "count_params = partial(jax.tree_util.tree_reduce,\n", + " lambda acc, param: acc + (param.size if isinstance(param, jnp.ndarray) else 0),\n", + " initializer=0\n", + ")\n", + "\n", + "print(f'{count_params(freeze_params):.3e} frozen params')\n", + "print(f'{count_params(train_params):.3e} trainable params')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "0CJ58F005g-c" + }, + "source": [ + "Training with this function is no different from training any other JAX function; just make sure to differentiate your loss with respect to the trainable parameters only (see the next section for an example)." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "m_lDOLnw5zoC" + }, + "source": [ + "## GPT-Q-ing + LoRA-ing HuggingFace's Flax GPT-2\n", + "I developed these transforms for use with my Haiku models, but since all JAX models are pure functions at the end of the day, it shouldn't matter what framework you use. Lorax supports matmuls and other matmul-like operations such as embedding lookups and 1-D convs.\n", + "\n", + "This is a minimal example of applying the combination to `gpt2-medium`, but it's basically model agnostic.\n", + "\n", + "First let's get the model:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "czS5kDWO6XTv" + }, + "outputs": [], + "source": [ + "from transformers import AutoTokenizer, FlaxAutoModelForCausalLM" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "VnfmpQ6f6Yal" + }, + "outputs": [], + "source": [ + "model_name = 'gpt2-medium'\n", + "tokenizer = AutoTokenizer.from_pretrained(model_name)\n", + "model, params = FlaxAutoModelForCausalLM.from_pretrained(model_name, _do_init=False)\n", + "params = jax.device_put(params, cpu)\n", + "\n", + "# Because the embedding table is reused as the output linear layer, it'll get quantized at the end of the process, but that will seriously screw up the embedding lookup step, so we'll just save it for later here\n", + "orig_embedding_table = np.asarray(params['transformer']['wte']['embedding'])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "evCyWa787m_N" + }, + "source": [ + "The GPT-Q paper used real text data for quantization, but for this demo I'll just generate some random values." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ao_vTWAf7Tw-" + }, + "outputs": [], + "source": [ + "QUANT_BATCH_SIZE = 4\n", + "QUANT_EXAMPLE_LENGTH = 64 # I'd recommend making this bigger, but needs to be small to not crash colab\n", + "\n", + "quantization_data = []\n", + "key = jax.random.PRNGKey(0)\n", + "for _ in range(32):\n", + " batch = jax.random.randint(key, (QUANT_BATCH_SIZE, QUANT_EXAMPLE_LENGTH), 0, 50256)\n", + " quantization_data.append(batch)\n", + " key, = jax.random.split(key, 1)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "0x_pT_fT8Co8" + }, + "source": [ + "HuggingFace's models don't have quite the right call signature, so we'll make a wrapper which takes (params, inputs) as an argument:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true, + "id": "yddz4OUN8Bvt" + }, + "outputs": [], + "source": [ + "def apply_model(params, batch):\n", + " return model(batch, params=params)\n", + "\n", + "quantized_params = jax_gptq.quantize(apply_model, params, quantization_data)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ehblO3I98akJ" + }, + "outputs": [], + "source": [ + "# Replace the quantized embedding table with the original one\n", + "quantized_params['transformer']['wte']['embedding'] = jnp.asarray(orig_embedding_table)\n", + "quantized_params = jax.device_put(quantized_params, gpu)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WYiCG5fE9yKT" + }, + "source": [ + "### Finetuning GPT-2 with Lorax\n", + "\n", + "Same as [above](https://colab.research.google.com/drive/18rkULbWqk7mNZDx7Scx-JS3p_s45mgok#scrollTo=HKkhcjx9zJy6&line=3&uniqifier=1), we get the original param structure to tell Lorax how to initialize the LoRA params, then merge the quantized params back in after." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "FKS_dfll93sO" + }, + "outputs": [], + "source": [ + "# Get pre-quantization param tree (some nodes will just be abstract values)\n", + "orig_params_or_shapes = jax_gptq.utils.quantized_params_to_shaped_arrays(quantized_params)\n", + "\n", + "# Tell Lorax which leaves should be frozen/fully trained/LoRA trained\n", + "spec = lorax.simple_spec(\n", + " orig_params_or_shapes,\n", + " lambda path, arr: 16 if any(pattern in path for pattern in ['c_attn', 'mlp']) else lorax.LORA_FREEZE,\n", + " tune_vectors=True\n", + ")\n", + "\n", + "# Initialize parameters\n", + "key, init_key = jax.random.split(key)\n", + "freeze_params, train_params = lorax.init_lora(\n", + " orig_params_or_shapes,\n", + " spec,\n", + " init_key\n", + ")\n", + "\n", + "# Put the quantized params back into the frozen param tree\n", + "freeze_params = merge_quantized_with_lora(quantized_params, freeze_params)\n", + "combined_params = freeze_params, train_params" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "T8bJwqN2Bfqh" + }, + "source": [ + "Now we can just transform the `apply_model` function and it will use both LoRA and 4-bit quantized parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "glARn7Z0BX4g" + }, + "outputs": [], + "source": [ + "quantized_plus_lora_fn = jax_gptq.use_quantized(lorax.lora(apply_model))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Y1G-d0yDBn8y" + }, + "source": [ + "### Training\n", + "Training isn't actually any different from normal training, since you can just think of `freeze_params` as being a constant argument, but here's a demo for completeness.\n", + "\n", + "First I'll define a toy corpus which demonstrates Alan's love of cats and Grace's dislike of them." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "I3fdjSioBvDO" + }, + "outputs": [], + "source": [ + "CATS = ['lions', 'tigers', 'cheetahs', 'cats', 'ocelots', 'kittens']\n", + "DOGS = ['wolves', 'dogs', 'coyotes', 'huskies', 'poodles', 'puppies']\n", + "\n", + "CAT_LOVER = 'Alan'\n", + "DOG_LOVER = 'Grace'\n", + "\n", + "dataset = []\n", + "for name, polarity in [(CAT_LOVER, True), (DOG_LOVER, False)]:\n", + "    liked, disliked = (CATS, DOGS) if polarity else (DOGS, CATS)\n", + "    for kind in liked:\n", + "        dataset.append(f'{name}: {kind}? I love them!')\n", + "        dataset.append(f'{name}: Hey look at those {kind}, that\\'s pretty cool')\n", + "\n", + "    for kind in disliked:\n", + "        dataset.append(f'{name}: {kind}? I hate them!')\n", + "        dataset.append(f'{name}: Oh no, some {kind}! How scary!')\n", + "\n", + "tokenized_data = [jnp.asarray(tokenizer.encode(ex)) for ex in dataset]\n", + "max_len = max(ex.shape[0] for ex in tokenized_data)\n", + "# Pad the data to speed up jitting. Not worrying about masking due to laziness.\n", + "tokenized_data = [jnp.pad(ex, (0, max_len - ex.shape[0])) for ex in tokenized_data]\n", + "\n", + "jitted_model = jax.jit(quantized_plus_lora_fn)\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "NZFLWJgxYqfh" + }, + "outputs": [], + "source": [ + "def make_prediction(params, prefix):\n", + "    tokens = jnp.asarray(tokenizer.encode(prefix))\n", + "    logits = jitted_model(params, tokens[None]).logits\n", + "\n", + "    # Probabilities over the vocabulary for the next token\n", + "    probs = jax.nn.softmax(logits[0, -1])\n", + "    pred_probs, pred_words = jax.lax.top_k(probs, 5)\n", + "\n", + "    print(f'Predictions for: \"{prefix}\"')\n", + "    for i, (word_id, prob) in enumerate(zip(pred_words, pred_probs), 1):\n", + "        print(f'{i}. {tokenizer.decode([word_id])} - {prob:.2%}')\n", + "    print()\n", + "\n", + "test_examples = [\n", + "    f'{CAT_LOVER}: jaguars? I',\n", + "    f'{DOG_LOVER}: jaguars? 
I'\n", + "]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "yT7hOBnYS-AC" + }, + "source": [ + "Let's look at the next-word predictions of the unmodified model:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "eew7ihGJTD85" + }, + "outputs": [], + "source": [ + "for ex in test_examples:\n", + "    make_prediction(combined_params, ex)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "BrSL1MgSDXfO" + }, + "source": [ + "Next we set up a standard training loop. The only difference is that we keep the train/freeze params separate for the optimizer. No changes are needed for the quantization.\n", + "\n", + "I'll just train with a batch size of 1 here since I don't want to bother with masking, but the transformed model function is fully compatible with vmap etc." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "52QdkmIxDHk-" + }, + "outputs": [], + "source": [ + "def loss_fn(train_params, freeze_params, seq):\n", + "    inputs = seq[:-1]\n", + "    targets = seq[1:]\n", + "\n", + "    combined_params = (freeze_params, train_params)\n", + "    logits = quantized_plus_lora_fn(combined_params, inputs[None]).logits[0]\n", + "    logprobs = jax.nn.log_softmax(logits)\n", + "    losses = -jnp.take_along_axis(logprobs, targets[:, None], axis=-1)\n", + "    return jnp.mean(losses)\n", + "\n", + "optimizer = optax.adamw(learning_rate=1e-4, weight_decay=1e-4)\n", + "opt_state = optimizer.init(combined_params[1])\n", + "\n", + "@jax.jit\n", + "def update_fn(combined_params, opt_state, example):\n", + "    freeze_params, train_params = combined_params\n", + "\n", + "    # The main thing is that we have to split up the params here so that JAX knows what to differentiate with respect to\n", + "    loss, grads = jax.value_and_grad(loss_fn)(train_params, freeze_params, example)\n", + "\n", + "    updates, opt_state = optimizer.update(grads, opt_state, params=train_params)\n", + "    new_train_params = optax.apply_updates(train_params, updates)\n", + "    return (freeze_params, new_train_params), opt_state, loss" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "cj2d1xIqFJw3" + }, + "outputs": [], + "source": [ + "bar = trange(50)\n", + "for epoch in bar:\n", + "    key, = jax.random.split(key, 1)\n", + "    permutation = jax.random.permutation(key, jnp.arange(len(dataset)))\n", + "    total_loss = 0\n", + "    for index in permutation:\n", + "        example = tokenized_data[index]\n", + "        combined_params, opt_state, loss = update_fn(combined_params, opt_state, example)\n", + "        total_loss += loss\n", + "    bar.set_description(f'Epoch {epoch} - Loss: {total_loss / len(tokenized_data):.3e}')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "IMFZwE8qeSUl" + }, + "source": [ + "The trained LoRA parameters give us a model which predicts that Alan will love jaguars, and Grace will hate them:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "GIgThnapFQS6" + }, + "outputs": [], + "source": [ + "for example in test_examples:\n", + "    make_prediction(combined_params, example)\n", + "    print()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "92W8jCjQeZ9J" + }, + "outputs": [], + "source": [] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [ + "0Y6JeyF45yd_" + ], + "gpuType": "T4", + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.10" + }, + "widgets": { + "application/vnd.jupyter.widget-state+json": { + "1feaed28d930404d8684eb14f7f363c7": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": 
"1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_3b045a0e3ecf4c098d8fa8b0917108a5", + "IPY_MODEL_ab35fcc175334e089b9abf95d91604c3", + "IPY_MODEL_bc4dcd32917f4970843bdde97d849285" + ], + "layout": "IPY_MODEL_ba84ab17fb18462a82cd6ddd9581d8de" + } + }, + "3b045a0e3ecf4c098d8fa8b0917108a5": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_479284c4dab249d5b88b9f471e988392", + "placeholder": "​", + "style": "IPY_MODEL_98ba61d9a09b42629f5985a4b82041c5", + "value": "Downloading (…)lve/main/config.json: 100%" + } + }, + "ab35fcc175334e089b9abf95d91604c3": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_0438654193164051b8eaf4e234299bf2", + "max": 896, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_dd336b937eff4790a8b7ed0b5f8d211f", + "value": 896 + } + }, + "bc4dcd32917f4970843bdde97d849285": { + 
"model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_d86b4cb7e789448a93d78e597a9c45eb", + "placeholder": "​", + "style": "IPY_MODEL_3fbe401e5b084e66b567e6a597183f01", + "value": " 896/896 [00:00<00:00, 27.9kB/s]" + } + }, + "ba84ab17fb18462a82cd6ddd9581d8de": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "479284c4dab249d5b88b9f471e988392": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + 
"model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "98ba61d9a09b42629f5985a4b82041c5": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "0438654193164051b8eaf4e234299bf2": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", 
+ "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "dd336b937eff4790a8b7ed0b5f8d211f": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "d86b4cb7e789448a93d78e597a9c45eb": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + 
"grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "3fbe401e5b084e66b567e6a597183f01": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "bd19569f52df4784ac7dbdfd159f5588": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_1314b9257b2047769a7c95408ddaddc3", + "IPY_MODEL_196c4ac94eab47048c7de169ce8e12f1", + "IPY_MODEL_9c40c0629bac4d3884a991153bb61a15" + ], + "layout": "IPY_MODEL_44411fd1e3254d97b05e3b536c5b4f3b" + } + }, + "1314b9257b2047769a7c95408ddaddc3": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + 
"_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_272afa7b903244e297c4904b19c01632", + "placeholder": "​", + "style": "IPY_MODEL_f6ef494723ad4b078cda848ec84bf621", + "value": "Downloading (…)model.bin.index.json: 100%" + } + }, + "196c4ac94eab47048c7de169ce8e12f1": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_646e4a2d4ed34cd2ab73cb598a011075", + "max": 55432, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_b4a84db21aa44c77bdbddbaabc3fc079", + "value": 55432 + } + }, + "9c40c0629bac4d3884a991153bb61a15": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_3313b927bd0a4546baf4ce6f6ebd5bcf", + "placeholder": "​", + "style": "IPY_MODEL_9b2914d7649a447a9e059d7a65600e44", + "value": " 55.4k/55.4k [00:00<00:00, 1.38MB/s]" + } + }, + "44411fd1e3254d97b05e3b536c5b4f3b": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + 
"_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "272afa7b903244e297c4904b19c01632": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + 
"margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "f6ef494723ad4b078cda848ec84bf621": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "646e4a2d4ed34cd2ab73cb598a011075": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + 
"visibility": null, + "width": null + } + }, + "b4a84db21aa44c77bdbddbaabc3fc079": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "3313b927bd0a4546baf4ce6f6ebd5bcf": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "9b2914d7649a447a9e059d7a65600e44": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + 
"_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "1169578ce65440f695f8ec234c84a90d": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_a835304240c54c5db280426c5fb856c4", + "IPY_MODEL_cd42bbbaf3eb4c4fb1ade3c314b2eaac", + "IPY_MODEL_ebdcacb5d2094c878af40538fb2ec42c" + ], + "layout": "IPY_MODEL_3ead271f91ea4af3a64928754048020f" + } + }, + "a835304240c54c5db280426c5fb856c4": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_93e3146c97c4430a84524f9fc804a219", + "placeholder": "​", + "style": "IPY_MODEL_0708839b96b1495a9e2f15f0d1b78574", + "value": "Downloading shards: 100%" + } + }, + "cd42bbbaf3eb4c4fb1ade3c314b2eaac": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + 
"_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_fd56b1d95eb04339b9b01ba691db4e81", + "max": 2, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_701570dcfc2a4b8d99b2ea9c08057fe7", + "value": 2 + } + }, + "ebdcacb5d2094c878af40538fb2ec42c": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_521d7df8efe849c0a43b7b2f2b10ba24", + "placeholder": "​", + "style": "IPY_MODEL_0da6f8caf9b34844bf763f7030c5e504", + "value": " 2/2 [01:32<00:00, 40.63s/it]" + } + }, + "3ead271f91ea4af3a64928754048020f": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": 
null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "93e3146c97c4430a84524f9fc804a219": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "0708839b96b1495a9e2f15f0d1b78574": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + 
"fd56b1d95eb04339b9b01ba691db4e81": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "701570dcfc2a4b8d99b2ea9c08057fe7": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "521d7df8efe849c0a43b7b2f2b10ba24": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + 
"_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "0da6f8caf9b34844bf763f7030c5e504": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "f99effa921ce42e2b62ca92d18d6d8d0": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_33889cada1fd42a3933eb4f1e923d7a4", + "IPY_MODEL_0f30b08741a74eab83dd3689671f1269", 
+ "IPY_MODEL_fd73f715ee7548979c57df22efda5097" + ], + "layout": "IPY_MODEL_e44cf747982d41bb862843133b58f4b1" + } + }, + "33889cada1fd42a3933eb4f1e923d7a4": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_8c12d66a9488411986f8a0d16822ba87", + "placeholder": "​", + "style": "IPY_MODEL_df7a49e2d8b34643baf60560e1aa89fc", + "value": "Downloading (…)l-00001-of-00002.bin: 100%" + } + }, + "0f30b08741a74eab83dd3689671f1269": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_7974a3139aca4dabbc2bc57844121070", + "max": 9449929179, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_865b5e4cd3a2499da68eaaf5d47dfc95", + "value": 9449929179 + } + }, + "fd73f715ee7548979c57df22efda5097": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": 
"", + "description_tooltip": null, + "layout": "IPY_MODEL_6a96f4ad805142468ae84bf77ed41d3a", + "placeholder": "​", + "style": "IPY_MODEL_6a751bdf2f094739b4c3cba478625f84", + "value": " 9.45G/9.45G [01:17<00:00, 173MB/s]" + } + }, + "e44cf747982d41bb862843133b58f4b1": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "8c12d66a9488411986f8a0d16822ba87": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": 
null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "df7a49e2d8b34643baf60560e1aa89fc": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "7974a3139aca4dabbc2bc57844121070": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + 
"grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "865b5e4cd3a2499da68eaaf5d47dfc95": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "6a96f4ad805142468ae84bf77ed41d3a": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + 
"order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "6a751bdf2f094739b4c3cba478625f84": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "c7f3959181c046c7b281c5af7d8c959c": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_a90ac03129fd40a2b5c9ae8d59dcdaef", + "IPY_MODEL_20b6fae45ebd4badbf402c81e3d24d47", + "IPY_MODEL_10973afff9364526b481f09d89e1d7e2" + ], + "layout": "IPY_MODEL_9d734a6342524c6dab9699468609a1d4" + } + }, + "a90ac03129fd40a2b5c9ae8d59dcdaef": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_630542ff534b410dadf99e20bb6194c5", + "placeholder": "​", + "style": "IPY_MODEL_a2d9be02f65a47cfbec746acd4a3360c", + "value": "Downloading (…)l-00002-of-00002.bin: 
100%" + } + }, + "20b6fae45ebd4badbf402c81e3d24d47": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_c6948d90a1484452bc693646ad619666", + "max": 1949494999, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_01641210f615480f8c239feb99f0e323", + "value": 1949494999 + } + }, + "10973afff9364526b481f09d89e1d7e2": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_0012771409a04059bb6eb85a6ad5eff2", + "placeholder": "​", + "style": "IPY_MODEL_430f70d9467041f8bea72b4a16e430de", + "value": " 1.95G/1.95G [00:14<00:00, 150MB/s]" + } + }, + "9d734a6342524c6dab9699468609a1d4": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": 
null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "630542ff534b410dadf99e20bb6194c5": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "a2d9be02f65a47cfbec746acd4a3360c": { + "model_module": 
"@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "c6948d90a1484452bc693646ad619666": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "01641210f615480f8c239feb99f0e323": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": 
"@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "0012771409a04059bb6eb85a6ad5eff2": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "430f70d9467041f8bea72b4a16e430de": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "88a9e1e6abdc439e9808df4235585137": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": 
"1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_6874e404870a407ab1672e15612712d8", + "IPY_MODEL_80c4510fff3e44c58b52e314d4ce1174", + "IPY_MODEL_45771a1396094a2e9f8bb6dc6800b908" + ], + "layout": "IPY_MODEL_b3b373bdfc26499dbf400667b2edc6b4" + } + }, + "6874e404870a407ab1672e15612712d8": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_8c93c66cdf0a44d4ab9bcc4df497d1ec", + "placeholder": "​", + "style": "IPY_MODEL_cd379c6435394897a3906d4bc6b71345", + "value": "Loading checkpoint shards: 100%" + } + }, + "80c4510fff3e44c58b52e314d4ce1174": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_5d198996c0344b8f92409b1cbff19af4", + "max": 2, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_d965c2e080bd452a99954a3b3af2c147", + "value": 2 + } + }, + "45771a1396094a2e9f8bb6dc6800b908": { + "model_module": 
"@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_b79886293822421895753aa0e2d2d170", + "placeholder": "​", + "style": "IPY_MODEL_f27d00a5f5964ee684787d97f132834a", + "value": " 2/2 [00:52<00:00, 23.26s/it]" + } + }, + "b3b373bdfc26499dbf400667b2edc6b4": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "8c93c66cdf0a44d4ab9bcc4df497d1ec": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": 
"1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "cd379c6435394897a3906d4bc6b71345": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "5d198996c0344b8f92409b1cbff19af4": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": 
null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "d965c2e080bd452a99954a3b3af2c147": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "b79886293822421895753aa0e2d2d170": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + 
"grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "f27d00a5f5964ee684787d97f132834a": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "2f214f129655487baff53148920c7c95": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_ef32e3942ca14835aaa1fcd1bfc80018", + "IPY_MODEL_43d505d0a0294e4abd4878f53d7f7bc8", + "IPY_MODEL_297ed97f4dbb43aba188a337e1d8bf26" + ], + "layout": "IPY_MODEL_0e2a8d04b5a143bc9bfa77f7b538954e" + } + }, + "ef32e3942ca14835aaa1fcd1bfc80018": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", 
+ "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_f99a48ae06114986aa4249d0876962a8", + "placeholder": "​", + "style": "IPY_MODEL_025cd7df6ab64885bb7939f28a28be28", + "value": "Downloading (…)neration_config.json: 100%" + } + }, + "43d505d0a0294e4abd4878f53d7f7bc8": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_71401278061d4f55af185007f7e21289", + "max": 147, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_36e7f52522764ab9bd5669366c49a8ea", + "value": 147 + } + }, + "297ed97f4dbb43aba188a337e1d8bf26": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_2d9b52f788d34d4d94b6e1b17f0a6ace", + "placeholder": "​", + "style": "IPY_MODEL_5c662a378c2a4529a0c742fb0a0210fe", + "value": " 147/147 [00:00<00:00, 8.15kB/s]" + } + }, + "0e2a8d04b5a143bc9bfa77f7b538954e": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + 
"_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "f99a48ae06114986aa4249d0876962a8": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + 
"max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "025cd7df6ab64885bb7939f28a28be28": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "71401278061d4f55af185007f7e21289": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + 
}, + "36e7f52522764ab9bd5669366c49a8ea": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "2d9b52f788d34d4d94b6e1b17f0a6ace": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "5c662a378c2a4529a0c742fb0a0210fe": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + 
"_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "db160865302c401facf54afd53f972d4": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_60d5faa861404744bd685257af4be791", + "IPY_MODEL_de92bc34f9d04ad18e0bbfb8cfdc4e3b", + "IPY_MODEL_077954e270e444caa52c3f6bbcf20817" + ], + "layout": "IPY_MODEL_8ecb7100390e4d0cb3425b173dcf07f1" + } + }, + "60d5faa861404744bd685257af4be791": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_12a0cb44ed93433ebbc9f6c80c4e494c", + "placeholder": "​", + "style": "IPY_MODEL_a7b74c6b00384f3f905e3fb8a8c3f216", + "value": "Downloading spiece.model: 100%" + } + }, + "de92bc34f9d04ad18e0bbfb8cfdc4e3b": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + 
"_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_c92cfb1a34af4d35ab4a1cbb552f1a8b", + "max": 791656, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_dea195394d484e6f96e097ac45eb926a", + "value": 791656 + } + }, + "077954e270e444caa52c3f6bbcf20817": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_7a381e3cefcf48a1bc188c49985f2bf2", + "placeholder": "​", + "style": "IPY_MODEL_40e7ef29f37e414cb27fc6798ec3dd69", + "value": " 792k/792k [00:00<00:00, 41.4MB/s]" + } + }, + "8ecb7100390e4d0cb3425b173dcf07f1": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": 
null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "12a0cb44ed93433ebbc9f6c80c4e494c": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "a7b74c6b00384f3f905e3fb8a8c3f216": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "c92cfb1a34af4d35ab4a1cbb552f1a8b": { + 
"model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "dea195394d484e6f96e097ac45eb926a": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "7a381e3cefcf48a1bc188c49985f2bf2": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": 
"@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "40e7ef29f37e414cb27fc6798ec3dd69": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "c1aadb4cc2754a56af360a1534dc9c07": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_b9874db4b125444c9c1e461ec982ab1d", + "IPY_MODEL_e78eef1b3587420cb805c2899d3d50f5", + 
"IPY_MODEL_a51b630132cc4f8dacd2b59666ca0640" + ], + "layout": "IPY_MODEL_e5c5b8a19d5b46269c203e313ac56ca2" + } + }, + "b9874db4b125444c9c1e461ec982ab1d": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_3b92d3a3cb0e4c19bf9fead4422a5554", + "placeholder": "​", + "style": "IPY_MODEL_34ad0d26150f4b589149d4ece6bf877a", + "value": "Downloading (…)/main/tokenizer.json: 100%" + } + }, + "e78eef1b3587420cb805c2899d3d50f5": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_d6a31ea687064ceebf54819168df219c", + "max": 1389353, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_f88fc2141c524793b3266f2ead90b42b", + "value": 1389353 + } + }, + "a51b630132cc4f8dacd2b59666ca0640": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + 
"description_tooltip": null, + "layout": "IPY_MODEL_d868549bbf9241e8b20a922ad5426b3e", + "placeholder": "​", + "style": "IPY_MODEL_4fa6dc96da8f49929249cc061a052167", + "value": " 1.39M/1.39M [00:00<00:00, 15.0MB/s]" + } + }, + "e5c5b8a19d5b46269c203e313ac56ca2": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "3b92d3a3cb0e4c19bf9fead4422a5554": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, 
+ "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "34ad0d26150f4b589149d4ece6bf877a": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "d6a31ea687064ceebf54819168df219c": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + 
"grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "f88fc2141c524793b3266f2ead90b42b": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "d868549bbf9241e8b20a922ad5426b3e": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + 
"order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "4fa6dc96da8f49929249cc061a052167": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + } + } + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file