|
6 | 6 | "id": "5918b41c-dad7-4f7b-9e39-b3026933dddf", |
7 | 7 | "metadata": {}, |
8 | 8 | "source": [ |
9 | | - "# Visual-language assistant with MiniCPM-V2 and OpenVINO\n", |
| 9 | + "# Visual-language assistant with MiniCPM-V and OpenVINO\n", |
10 | 10 | "\n", |
11 | | - "MiniCPM-V 2 is a strong multimodal large language model for efficient end-side deployment. MiniCPM-V 2.6 is the latest and most capable model in the MiniCPM-V series. The model is built on SigLip-400M and Qwen2-7B with a total of 8B parameters. It exhibits a significant performance improvement over previous versions, and introduces new features for multi-image and video understanding.\n", |
| 11 | + "MiniCPM-V 4.0 is the latest efficient model in the MiniCPM-V series. The model is built on SigLIP2-400M and MiniCPM4-3B with a total of 4.1B parameters. It inherits the strong single-image, multi-image and video understanding performance of MiniCPM-V 2.6 with largely improved efficiency.\n",
| 12 | + "More details about the model can be found in the [model card](https://huggingface.co/openbmb/MiniCPM-V-4) and the original [repo](https://github.com/OpenBMB/MiniCPM-V).\n",
12 | 13 | "\n", |
13 | | - "More details about model can be found in [model card](https://huggingface.co/openbmb/MiniCPM-V-2_6) and original [repo](https://github.com/OpenBMB/MiniCPM-V).\n", |
14 | 14 | "\n", |
15 | | - "In this tutorial we consider how to convert and optimize MiniCPM-V2 model for creating multimodal chatbot. Additionally, we demonstrate how to apply stateful transformation on LLM part and model optimization techniques like weights compression using [NNCF](https://github.com/openvinotoolkit/nncf)\n", |
| 15 | + "In this tutorial we consider how to convert and optimize the MiniCPM-V model for creating a multimodal chatbot. Additionally, we demonstrate how to apply stateful transformation on the LLM part and model optimization techniques like weight compression using [NNCF](https://github.com/openvinotoolkit/nncf).\n",
16 | 16 | "\n", |
17 | 17 | "#### Table of contents:\n", |
18 | 18 | "\n", |
19 | 19 | "- [Prerequisites](#Prerequisites)\n", |
20 | 20 | "- [Convert model to OpenVINO Intermediate Representation](#Convert-model-to-OpenVINO-Intermediate-Representation)\n", |
| 21 | + " - [Select model](#Select-model)\n", |
21 | 22 | " - [Compress Language Model Weights to 4 bits](#Compress-Language-Model-Weights-to-4-bits)\n", |
22 | 23 | "- [Prepare model inference pipeline](#Prepare-model-inference-pipeline)\n", |
23 | 24 | " - [Select device](#Select-device)\n", |
|
47 | 48 | }, |
48 | 49 | { |
49 | 50 | "cell_type": "code", |
50 | | - "execution_count": 1, |
| 51 | + "execution_count": null, |
51 | 52 | "id": "0116846d-da6f-4e81-b6be-0a882a3eb872", |
52 | 53 | "metadata": {}, |
53 | | - "outputs": [], |
| 54 | + "outputs": [ |
| 55 | + { |
| 56 | + "name": "stdout", |
| 57 | + "output_type": "stream", |
| 58 | + "text": [ |
| 59 | + "\n", |
| 60 | + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m25.1.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.2\u001b[0m\n", |
| 61 | + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n", |
| 62 | + "Note: you may need to restart the kernel to use updated packages.\n", |
| 63 | + "\n", |
| 64 | + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m25.1.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.2\u001b[0m\n", |
| 65 | + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n", |
| 66 | + "Note: you may need to restart the kernel to use updated packages.\n", |
| 67 | + "\n", |
| 68 | + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m25.1.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.2\u001b[0m\n", |
| 69 | + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n", |
| 70 | + "Note: you may need to restart the kernel to use updated packages.\n", |
| 71 | + "\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n", |
| 72 | + "optimum-intel 1.26.0.dev0+7c64417 requires optimum==1.27.*, but you have optimum 2.0.0.dev0 which is incompatible.\u001b[0m\u001b[31m\n", |
| 73 | + "\u001b[0m\n", |
| 74 | + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m25.1.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.2\u001b[0m\n", |
| 75 | + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n", |
| 76 | + "Note: you may need to restart the kernel to use updated packages.\n", |
| 77 | + "\n", |
| 78 | + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m25.1.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.2\u001b[0m\n", |
| 79 | + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n", |
| 80 | + "Note: you may need to restart the kernel to use updated packages.\n" |
| 81 | + ] |
| 82 | + } |
| 83 | + ], |
54 | 84 | "source": [ |
| 85 | + "import platform\n", |
| 86 | + "\n", |
| 87 | + "if platform.system() == \"Darwin\":\n", |
| 88 | + " %pip install -q \"numpy<2.0.0\"\n", |
| 89 | + "\n", |
55 | 90 | "%pip install -q \"torch>=2.1\" \"torchvision\" \"timm>=0.9.2\" \"transformers>=4.45\" \"Pillow\" \"gradio>=4.40\" \"tqdm\" \"sentencepiece\" \"peft\" \"huggingface-hub>=0.24.0\" --extra-index-url https://download.pytorch.org/whl/cpu\n", |
56 | 91 | "%pip install -q \"nncf>=2.14.0\"\n", |
57 | | - "%pip install -q \"git+https://github.com/huggingface/optimum-intel.git\" --extra-index-url https://download.pytorch.org/whl/cpu\n", |
| 92 | + "%pip install -q \"git+https://github.com/openvino-dev-samples/optimum-intel.git@minicpm4v\" --extra-index-url https://download.pytorch.org/whl/cpu\n", |
| 93 | + "%pip install -q \"git+https://github.com/openvino-dev-samples/optimum.git@minicpm4v\" --extra-index-url https://download.pytorch.org/whl/cpu\n", |
58 | 94 | "%pip install -q -U --pre \"openvino>=2025.0\" \"openvino-tokenizers>=2025.0\" \"openvino-genai>=2025.0\" --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly" |
59 | 95 | ] |
60 | 96 | }, |
|
110 | 146 | "\n", |
111 | 147 | "where task is the task to export the model for; if not specified, the task will be auto-inferred based on the model. You can find a mapping between tasks and model classes in the Optimum TaskManager [documentation](https://huggingface.co/docs/optimum/exporters/task_manager). Additionally, you can specify weights compression using the `--weight-format` argument with one of the following options: `fp32`, `fp16`, `int8` and `int4`. For `int8` and `int4`, [nncf](https://github.com/openvinotoolkit/nncf) will be used for weight compression. More details about model export are provided in the [Optimum Intel documentation](https://huggingface.co/docs/optimum/intel/openvino/export#export-your-model).\n",
112 | 148 | "\n", |
| 149 | + "## Select model\n", |
| 150 | + "[back to top ⬆️](#Table-of-contents:)\n", |
| 151 | + "\n", |
| 152 | + "* **MiniCPM-V-4**: MiniCPM-V 4.0 is the latest efficient model in the MiniCPM-V series. The model is built on SigLIP2-400M and MiniCPM4-3B with a total of 4.1B parameters. It inherits the strong single-image, multi-image and video understanding performance of MiniCPM-V 2.6 with largely improved efficiency.\n",
| 153 | + "* **MiniCPM-V-2_6**: MiniCPM-V 2.6 is built on SigLip-400M and Qwen2-7B with a total of 8B parameters. It exhibits a significant performance improvement over MiniCPM-Llama3-V 2.5, and introduces new features for multi-image and video understanding." |
| 154 | + ] |
| 155 | + }, |
| 156 | + { |
| 157 | + "cell_type": "code", |
| 158 | + "execution_count": 3, |
| 159 | + "id": "a0851b3c", |
| 160 | + "metadata": { |
| 161 | + "test_replace": { |
| 162 | + "openbmb/MiniCPM-V-4": "katuni4ka/tiny-random-minicpmv-2_6" |
| 163 | + } |
| 164 | + }, |
| 165 | + "outputs": [ |
| 166 | + { |
| 167 | + "data": { |
| 168 | + "application/vnd.jupyter.widget-view+json": { |
| 169 | + "model_id": "289c2574f5604076bdcd8eccabc4a14f", |
| 170 | + "version_major": 2, |
| 171 | + "version_minor": 0 |
| 172 | + }, |
| 173 | + "text/plain": [ |
| 174 | + "Dropdown(description='Model:', options=('openbmb/MiniCPM-V-4', 'openbmb/MiniCPM-V-2_6'), value='openbmb/MiniCP…" |
| 175 | + ] |
| 176 | + }, |
| 177 | + "execution_count": 3, |
| 178 | + "metadata": {}, |
| 179 | + "output_type": "execute_result" |
| 180 | + } |
| 181 | + ], |
| 182 | + "source": [ |
| 183 | + "import ipywidgets as widgets\n", |
| 184 | + "\n", |
| 185 | + "model_ids = [\"openbmb/MiniCPM-V-4\", \"openbmb/MiniCPM-V-2_6\"]\n", |
| 186 | + "\n", |
| 187 | + "model_selector = widgets.Dropdown(\n", |
| 188 | + " options=model_ids,\n", |
| 189 | + " value=model_ids[0],\n",
| 190 | + " description=\"Model:\",\n", |
| 191 | + ")\n", |
| 192 | + "\n", |
| 193 | + "\n", |
| 194 | + "model_selector" |
| 195 | + ] |
| 196 | + }, |
| 197 | + { |
| 198 | + "cell_type": "markdown", |
| 199 | + "id": "59dcd94b", |
| 200 | + "metadata": {}, |
| 201 | + "source": [ |
113 | 202 | "### Compress Language Model Weights to 4 bits\n", |
114 | 203 | "[back to top ⬆️](#Table-of-contents:)\n", |
115 | 204 | "\n", |
|
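The `optimum-cli` invocation shown in the export cell's output can also be assembled programmatically, which makes it easier to vary the model id or weight format; a minimal sketch (the model id is hard-coded here as an example, while the notebook takes it from the dropdown):

```python
from pathlib import Path

model_id = "openbmb/MiniCPM-V-4"  # example; normally model_selector.value
model_dir = Path(model_id.split("/")[-1] + "-ov")  # e.g. MiniCPM-V-4-ov

# Equivalent to the command displayed by the export cell.
cmd = [
    "optimum-cli", "export", "openvino",
    "--model", model_id,
    str(model_dir),
    "--trust-remote-code",
    "--weight-format", "fp16",
    "--task", "image-text-to-text",
]
print(" ".join(cmd))
```

Running this list through `subprocess.run(cmd, check=True)` would mirror what the `optimum_cli` helper does in the cell below.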
134 | 223 | }, |
135 | 224 | { |
136 | 225 | "cell_type": "code", |
137 | | - "execution_count": 3, |
| 226 | + "execution_count": null, |
138 | 227 | "id": "82e846bb", |
139 | | - "metadata": { |
140 | | - "test_replace": { |
141 | | - "openbmb/MiniCPM-V-2_6": "katuni4ka/tiny-random-minicpmv-2_6" |
142 | | - } |
143 | | - }, |
| 228 | + "metadata": {}, |
144 | 229 | "outputs": [ |
| 230 | + { |
| 231 | + "data": { |
| 232 | + "text/markdown": [ |
| 233 | + "**Export command:**" |
| 234 | + ], |
| 235 | + "text/plain": [ |
| 236 | + "<IPython.core.display.Markdown object>" |
| 237 | + ] |
| 238 | + }, |
| 239 | + "metadata": {}, |
| 240 | + "output_type": "display_data" |
| 241 | + }, |
| 242 | + { |
| 243 | + "data": { |
| 244 | + "text/markdown": [ |
| 245 | + "`optimum-cli export openvino --model openbmb/MiniCPM-V-4 MiniCPM-V-4-ov --trust-remote-code --weight-format fp16 --task image-text-to-text`" |
| 246 | + ], |
| 247 | + "text/plain": [ |
| 248 | + "<IPython.core.display.Markdown object>" |
| 249 | + ] |
| 250 | + }, |
| 251 | + "metadata": {}, |
| 252 | + "output_type": "display_data" |
| 253 | + }, |
145 | 254 | { |
146 | 255 | "name": "stdout", |
147 | 256 | "output_type": "stream", |
148 | 257 | "text": [ |
149 | | - "INFO:nncf:NNCF initialized successfully. Supported frameworks detected: torch, tensorflow, onnx, openvino\n" |
| 258 | + "WARNING:nncf:NNCF provides best results with torch==2.7.*, while current torch version is 2.5.1+cpu. If you encounter issues, consider switching to torch==2.7.*\n", |
| 259 | + "INFO:nncf:Statistics of the bitwidth distribution:\n", |
| 260 | + "┍━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┑\n", |
| 261 | + "│ Weight compression mode │ % all parameters (layers) │ % ratio-defining parameters (layers) │\n", |
| 262 | + "┝━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┥\n", |
| 263 | + "│ int4_sym │ 100% (225 / 225) │ 100% (225 / 225) │\n", |
| 264 | + "┕━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┙\n" |
150 | 265 | ] |
| 266 | + }, |
| 267 | + { |
| 268 | + "data": { |
| 269 | + "application/vnd.jupyter.widget-view+json": { |
| 270 | + "model_id": "e5a6ec13d42f41109d029aced33475ff", |
| 271 | + "version_major": 2, |
| 272 | + "version_minor": 0 |
| 273 | + }, |
| 274 | + "text/plain": [ |
| 275 | + "Output()" |
| 276 | + ] |
| 277 | + }, |
| 278 | + "metadata": {}, |
| 279 | + "output_type": "display_data" |
151 | 280 | } |
152 | 281 | ], |
153 | 282 | "source": [ |
|
174 | 303 | " shutil.move(ov_int4_model_path.with_suffix(\".bin\"), ov_model_path.with_suffix(\".bin\"))\n", |
175 | 304 | "\n", |
176 | 305 | "\n", |
177 | | - "model_id = \"openbmb/MiniCPM-V-2_6\"\n", |
178 | | - "model_dir = Path(model_id.split(\"/\")[-1] + \"-ov\")\n", |
| 306 | + "model_dir = Path(model_selector.value.split(\"/\")[-1] + \"-ov\")\n", |
179 | 307 | "\n", |
180 | 308 | "if not model_dir.exists():\n", |
181 | | - " optimum_cli(model_id, model_dir, additional_args={\"trust-remote-code\": \"\", \"weight-format\": \"fp16\", \"task\": \"image-text-to-text\"})\n", |
| 309 | + " optimum_cli(model_selector.value, model_dir, additional_args={\"trust-remote-code\": \"\", \"weight-format\": \"fp16\", \"task\": \"image-text-to-text\"})\n", |
182 | 310 | " compress_lm_weights(model_dir)" |
183 | 311 | ] |
184 | 312 | }, |
|
213 | 341 | }, |
214 | 342 | { |
215 | 343 | "cell_type": "code", |
216 | | - "execution_count": 1, |
| 344 | + "execution_count": null, |
217 | 345 | "id": "626fef57", |
218 | 346 | "metadata": {}, |
219 | | - "outputs": [ |
220 | | - { |
221 | | - "data": { |
222 | | - "application/vnd.jupyter.widget-view+json": { |
223 | | - "model_id": "2362638a795340e6b3effb0805848768", |
224 | | - "version_major": 2, |
225 | | - "version_minor": 0 |
226 | | - }, |
227 | | - "text/plain": [ |
228 | | - "Dropdown(description='Device:', index=1, options=('CPU', 'AUTO'), value='AUTO')" |
229 | | - ] |
230 | | - }, |
231 | | - "execution_count": 1, |
232 | | - "metadata": {}, |
233 | | - "output_type": "execute_result" |
234 | | - } |
235 | | - ], |
| 347 | + "outputs": [], |
236 | 348 | "source": [ |
237 | 349 | "from notebook_utils import device_widget\n", |
238 | 350 | "\n", |
|
243 | 355 | }, |
244 | 356 | { |
245 | 357 | "cell_type": "code", |
246 | | - "execution_count": 5, |
| 358 | + "execution_count": null, |
247 | 359 | "id": "e7af404b", |
248 | 360 | "metadata": {}, |
249 | 361 | "outputs": [], |
|
394 | 506 | "source": [ |
395 | 507 | "from gradio_helper import make_demo\n", |
396 | 508 | "\n", |
397 | | - "demo = make_demo(ov_model)\n", |
| 509 | + "demo = make_demo(ov_model, model_selector.value.split(\"/\")[-1])\n", |
398 | 510 | "\n", |
399 | 511 | "try:\n", |
400 | 512 | " demo.launch(debug=True, height=600)\n", |
|
422 | 534 | "name": "python", |
423 | 535 | "nbconvert_exporter": "python", |
424 | 536 | "pygments_lexer": "ipython3", |
425 | | - "version": "3.11.4" |
| 537 | + "version": "3.10.12" |
426 | 538 | }, |
427 | 539 | "openvino_notebooks": { |
428 | 540 | "imageUrl": "https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/7b0919ea-6fe4-4c8f-8395-cb0ee6e87937", |
|