
Conversation


@Mohamed-Ashraf273 Mohamed-Ashraf273 commented Aug 2, 2025

Description of the change

As part of my GSoC25 project to support inference with the OpenVINO backend for Gemma, Mistral, and GPT-2, this PR adds support for the Gemma, Mistral, and GPT-2 pipelines.

```python
import os
os.environ["KERAS_BACKEND"] = "openvino"
import keras_hub

model = keras_hub.models.GPT2CausalLM.from_preset(
    "gpt2_large_en", dtype="float32"
)
model.summary()
output = model.generate("Keras is ", max_length=20)
print("Generated text:", output)
```

Reference

https://docs.openvino.ai/2025/index.html
https://keras.io/api/
https://keras.io/keras_hub/

Colab Notebook

Checklist

  • I have added all the necessary unit tests for my change.
  • I have verified that my change does not break existing code and works with all backends (TensorFlow, JAX, and PyTorch).
  • My PR is based on the latest changes of the main branch (if unsure, rebase the code).
  • I have followed the Keras Hub Model contribution guidelines in making these changes.
  • I have followed the Keras Hub API design guidelines in making these changes.
  • I have signed the Contributor License Agreement.

@github-actions github-actions bot added the Gemma (Gemma model specific issues) label on Aug 2, 2025
@Mohamed-Ashraf273

@fchollet
@mattdangerw
@divyashreepathihalli
I'd appreciate any help reviewing this PR so we can get it merged. Thanks in advance!

@Mohamed-Ashraf273 Mohamed-Ashraf273 changed the title from "[OpenVINO backend] support inference for Mistral & Gemma & GPT2 using…" to "[OpenVINO backend] support inference for Mistral & Gemma & GPT2 using OpenVINO backend" on Aug 2, 2025

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @Mohamed-Ashraf273, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

I've implemented support for running inference with Gemma, Mistral, and GPT-2 models on the OpenVINO backend. This work is part of my GSoC25 project and aims to broaden the compatibility of Keras Hub models with various hardware platforms. The changes involve integrating OpenVINO's inference capabilities directly into the CausalLM models, introducing new utility functions for OpenVINO-specific operations, and updating the testing infrastructure to properly handle the OpenVINO backend's characteristics, including skipping tests that involve training or have known incompatibilities.

Highlights

  • OpenVINO Backend Integration: I've added comprehensive support for running inference with Gemma, Mistral, and GPT-2 models using the OpenVINO backend. This significantly expands the deployment options for these models within Keras Hub.
  • New OpenVINO Utility Module: To facilitate OpenVINO integration, I've introduced a new utility module, openvino_utils.py, which handles critical aspects like input parameterization, structured output processing, and intelligent model reusability checks for optimized performance.
  • CausalLM Model Adaptations for OpenVINO: I've updated the core CausalLM model's make_generate_function to dynamically switch to OpenVINO-optimized inference when the OpenVINO backend is active. This includes handling specific ops.slice and ops.reshape behaviors required by OpenVINO for certain models like Gemma and GPT-2.
  • Enhanced Test Skipping and Compatibility for OpenVINO: To ensure test suite compatibility and efficiency with the OpenVINO backend, I've implemented a new requires_trainable_backend pytest marker. This allows tests that rely on training functionalities (not yet supported by OpenVINO) to be skipped appropriately. Additionally, I've introduced exclusion lists (openvino_excluded_concrete_tests.txt and openvino_excluded_tests.txt) to manage specific test skips for OpenVINO. (A minimal sketch of this marker pattern follows this list.)
  • Sampler Logic Adjustments for OpenVINO: The sampler.py module has been modified to ensure its ops.while_loop constructs are compatible with OpenVINO's requirements, specifically by explicitly passing the mask parameter in the loop's condition and body functions.
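To make the test-skipping approach above concrete, here is a minimal, hypothetical sketch of how a `requires_trainable_backend` marker could be wired into a `conftest.py`. The exact file names and skip logic in this PR may differ; only `keras.config.backend()` and the standard pytest hooks are assumed.

```python
# Hypothetical sketch of a conftest.py hook that skips training-dependent
# tests on the OpenVINO backend; details are illustrative, not the PR's code.
import keras
import pytest


def pytest_configure(config):
    config.addinivalue_line(
        "markers",
        "requires_trainable_backend: mark test as requiring a trainable backend",
    )


def pytest_collection_modifyitems(config, items):
    if keras.config.backend() != "openvino":
        return
    skip_training = pytest.mark.skip(
        reason="OpenVINO backend supports inference only; training is unavailable."
    )
    for item in items:
        if "requires_trainable_backend" in item.keywords:
            item.add_marker(skip_training)
```

Individual tests would then opt in with `@pytest.mark.requires_trainable_backend`.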
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for OpenVINO backend inference for several models, which is a significant enhancement. The changes include modifications to the testing infrastructure to handle backend-specific tests, updates to the core model generation logic, and the addition of new OpenVINO utility functions and tests. Overall, the implementation is solid, but I've identified a critical bug in the CausalLM class that could lead to an AttributeError, along with several opportunities for code simplification, efficiency improvements, and increased robustness. Please see my detailed comments below.

@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch from 458c345 to 8eda2ab on August 2, 2025 19:38
@Mohamed-Ashraf273

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for the OpenVINO backend for Gemma, Mistral, and GPT-2 models. The changes are extensive and well-structured, including:

  • Adding OpenVINO-specific logic in CausalLM for model compilation and inference.
  • Introducing a new openvino_utils.py module with helper functions and corresponding tests.
  • Updating conftest.py to selectively skip tests that are not compatible with the OpenVINO backend, using a new marker and an exclusion list.
  • Adding openvino to the project dependencies.

The implementation is solid, but I have a few suggestions to improve robustness and maintainability, such as using more specific exception handling, avoiding hardcoded device names for OpenVINO, and cleaning up test exclusion files. Overall, this is a great contribution to extend backend support.
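On the hardcoded-device-name point, a small sketch of the suggested direction might look like the following; the `KERAS_OPENVINO_DEVICE` variable is an assumed name for illustration, not an actual keras-hub setting.

```python
# Illustrative only: choose the OpenVINO device from an (assumed) environment
# variable instead of hardcoding "CPU" at the compile_model call site.
import os

import openvino as ov


def compile_for_inference(ov_model):
    core = ov.Core()
    device = os.environ.get("KERAS_OPENVINO_DEVICE", "CPU")
    if device != "CPU" and device not in core.available_devices:
        # Fall back rather than fail when the requested device is unavailable.
        device = "CPU"
    return core.compile_model(ov_model, device)
```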

@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch from 7875897 to ab31c72 on August 2, 2025 19:57
@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch from 9657b98 to 91b478f on August 3, 2025 11:55
@Mohamed-Ashraf273

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for the OpenVINO backend for Gemma, Mistral, and GPT-2 models. The changes are comprehensive, touching the test configuration to skip non-compatible tests, adding core OpenVINO inference logic to CausalLM, and including necessary workarounds in model implementations. My review focuses on improving the correctness and maintainability of the new OpenVINO integration. I've identified a potential issue with input handling for the OpenVINO model, recommended more robust exception handling, and pointed out some opportunities for code cleanup.

@Mohamed-Ashraf273 Mohamed-Ashraf273 changed the title from "[OpenVINO backend] support inference for Mistral & Gemma & GPT2 using OpenVINO backend" to "[OpenVINO backend] Adding support for OpenVINO backend & support inference for Mistral & Gemma & GPT2" on Aug 3, 2025
@Mohamed-Ashraf273

@mattdangerw
@divyashreepathihalli
@fchollet
Could you please take a look?

@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch 5 times, most recently from bc0afe5 to fa11864 on August 4, 2025 12:31
@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch from fa11864 to 8baea81 on August 4, 2025 12:31
@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch from 1b742c5 to 110b9c3 on August 19, 2025 14:07
@Mohamed-Ashraf273

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for the OpenVINO backend for inference with Gemma, Mistral, and GPT-2 models. The changes are comprehensive, including modifications to the test configuration to selectively skip unsupported tests, adjustments in the CausalLM and Sampler implementations for OpenVINO compatibility, and the addition of a new openvino_utils.py module with helper functions for inference and test management. I've found a couple of areas for improvement. There's a bug in how the test whitelist file is parsed, as it doesn't handle comments correctly. I've also suggested a refactoring in the TrainingMethodDetector class for better performance and code style. Overall, this is a great contribution to extend Keras Hub's backend support.
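As an illustration of the whitelist-parsing fix suggested above, reading the exclusion file while ignoring blank lines and `#` comments could look like the sketch below; the file name is taken from the PR description, and the exact helper in the PR may differ.

```python
# Sketch: load test exclusions from a text file such as
# "openvino_excluded_tests.txt", skipping blank lines and '#' comments
# (including trailing comments after an entry).
def load_excluded_tests(path="openvino_excluded_tests.txt"):
    excluded = set()
    with open(path) as f:
        for line in f:
            entry = line.split("#", 1)[0].strip()
            if entry:
                excluded.add(entry)
    return excluded
```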

@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch from ef7b03d to 3ed2fea on August 19, 2025 17:06
@Mohamed-Ashraf273

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for the OpenVINO backend for inference with several models like Gemma, Mistral, and GPT-2. The changes are extensive, including a new OpenVINO utilities module with helpers for model compilation, inference, and test management. The test configuration is updated to intelligently skip tests that are not compatible with the OpenVINO backend. The core model generation logic is also adapted to use the OpenVINO inference path when the backend is active.

My review focuses on improving performance in the test suite and enhancing code quality in the new test files. I've identified an inefficiency in how OpenVINO-specific tests are configured and a code style issue regarding local imports in the new test file. Overall, this is a solid contribution that significantly expands the backend support of Keras Hub.

@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch from 3ed2fea to 9bac18f on August 19, 2025 17:40
@rkazants

rkazants commented Sep 2, 2025

github-merge-queue bot pushed a commit to openvinotoolkit/openvino that referenced this pull request Sep 2, 2025
@rkazants 
@mvafin 
@mlukasze 
@evkotov 
@CuriousPanCake 
@itikhono 

### Performance issue description

## Problem
The OpenVINO backend exhibits **excessive memory consumption** during GPT-2
model inference compared to the other Keras backends (TensorFlow, PyTorch,
JAX). The spike occurs during the model compilation phase, when the Keras
model is converted to OpenVINO format, and the resulting memory usage makes
the OpenVINO backend unsuitable for memory-constrained environments.

## Summary of the solution
This solves issue #31390.
My first attempt was to introduce `EinsumDecomposition` at MOC in PR
#31482. Both that version and the original `EinsumDecomposition` in
`CommonOptimizations` introduce `Broadcast` nodes, but in the MOC variant
the pipeline later removed them, which allowed constants to be shared
before the `ConstantFolding` pass that otherwise duplicates them in
`CommonOptimizations`, reducing memory usage. Comparing the two showed that
both decompositions initially produce the same graph; the MOC version
simply benefits from an additional simplification step that cleans up the
broadcasts. After debugging, I identified that step as the `NopElimination`
pass. Applying `NopElimination` in `CommonOptimizations` just before
`ConstantFolding` achieves the same effect: broadcasts disappear, constants
are shared, and memory usage drops, without needing to move
`EinsumDecomposition` into MOC.

### 📊 Complete Analysis & Benchmarks
For a comprehensive performance comparison, optimization results, and
technical details across all Keras backends, see:

**[📊 Detailed Performance Report & Memory Optimization Analysis](https://gist.github.com/Mohamed-Ashraf273/1ecc15bd5e83c229d7e3f07851624bc8)**

The report includes cross-backend benchmarks before and after both fixes,
which produced the same results for OpenVINO.

---

### Step-by-step reproduction

Use keras source: https://github.com/keras-team/keras.git  
Also use this PR from keras_hub:
keras-team/keras-hub#2350

```python
import os
os.environ["KERAS_BACKEND"] = "openvino"

import keras_hub
causal_lm = keras_hub.models.GPT2CausalLM.from_preset("gpt2_medium_en", dtype="float32")
output = causal_lm.generate("Hello", max_length=10)  # Memory spike occurs here
```
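The memory spike can be observed by wrapping the `generate()` call with a process-memory probe, for example with `psutil`. This harness is only a rough way to reproduce the numbers in the report; it is not part of the PR.

```python
# Rough measurement harness using psutil; RSS after generate() is a crude
# proxy for the peak usage during compilation.
import os

os.environ["KERAS_BACKEND"] = "openvino"

import keras_hub
import psutil


def rss_mb():
    return psutil.Process(os.getpid()).memory_info().rss / 1024**2


causal_lm = keras_hub.models.GPT2CausalLM.from_preset(
    "gpt2_medium_en", dtype="float32"
)
print(f"RSS after model load: {rss_mb():.0f} MiB")
output = causal_lm.generate("Hello", max_length=10)
print(f"RSS after generate:   {rss_mb():.0f} MiB")
```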
Example Graph:
```python
# Assumed imports for this standalone example (exact opset module may differ):
import numpy as np
import openvino as ov
from openvino.runtime import opset13 as ops


def create_einsum_constant_model():
    """Create a model with both constant and non-constant einsum patterns from different sources"""
    
    input_tensor = ops.parameter([1, 10, 1024], np.float32, name="input")
    
    # Create diverse constant sources for einsum operations
    # Source 1: Direct constant weight matrix
    weight_data_1 = np.random.randn(1024, 16, 64).astype(np.float32)
    const_weight_1 = ops.constant(weight_data_1, name="const_weight_1")
    
    # Source 2: Constant from addition 
    base_weight_2 = ops.constant(np.random.randn(1024, 16, 64).astype(np.float32), name="base_weight_2")
    bias_weight_2 = ops.constant(np.random.randn(1024, 16, 64).astype(np.float32), name="bias_weight_2")
    const_weight_2 = ops.add(base_weight_2, bias_weight_2)  # Constant folded
    
    # Source 3: Constant from multiply (your original source)
    base_weight_3 = ops.constant(np.random.randn(1024, 16, 64).astype(np.float32), name="base_weight_3")
    scale_3 = ops.constant(np.array(0.125, dtype=np.float32), name="scale_3")
    const_weight_3 = ops.multiply(base_weight_3, scale_3)  # Constant folded
    
    # Source 4: Constant from reshape
    flat_weight_4 = ops.constant(np.random.randn(1024*16*64).astype(np.float32), name="flat_weight_4")
    const_weight_4 = ops.reshape(flat_weight_4, [1024, 16, 64], special_zero=False)
    
    # Source 5: Constant from transpose
    orig_weight_5 = ops.constant(np.random.randn(16, 1024, 64).astype(np.float32), name="orig_weight_5")
    const_weight_5 = ops.transpose(orig_weight_5, [1, 0, 2])  # [1024, 16, 64]
    
    current = input_tensor
    
    # Create 10 einsum operations with constants (WILL BE OPTIMIZED)
    const_sources = [const_weight_1, const_weight_2, const_weight_3, const_weight_4, const_weight_5]
    
    for i in range(5):  # Use each constant source twice (5*2 = 10)
        for j in range(2):
            const_idx = i
            einsum_out = ops.einsum([current, const_sources[const_idx]], "abc,cde->abde")
            
            # Add bias to continue the chain
            bias = ops.constant(np.random.randn(16, 64).astype(np.float32), name=f"bias_{i}_{j}")
            current = ops.add(einsum_out, bias)
            
            # Reshape to prepare for next iteration
            if i < 4 or j < 1:  # Not the last iteration
                proj_weight = ops.constant(np.random.randn(16*64, 1024).astype(np.float32), name=f"proj_{i}_{j}")
                reshaped = ops.reshape(current, [1, 10, 16*64], special_zero=False)
                current = ops.matmul(reshaped, proj_weight, transpose_a=False, transpose_b=False)
    
    # Now create variable tensors from different sources for non-constant einsums
    # Start fresh with current tensor for variable operations
    var_source = ops.reshape(current, [1, 10, 16, 64], special_zero=False)
    
    # Create 20 einsum operations without constants (WON'T BE OPTIMIZED)
    for i in range(10):
        # Source 1: Split operations to create variable tensors
        split_axis = ops.constant(np.array(3, dtype=np.int32), name=f"split_axis_{i}")
        split_lengths = ops.constant(np.array([32, 32], dtype=np.int32), name=f"split_lengths_{i}")
        split_result = ops.variadic_split(var_source, split_axis, split_lengths)
        
        var_tensor_1 = split_result.output(0)  # [1, 10, 16, 32] - Variable
        var_tensor_2 = split_result.output(1)  # [1, 10, 16, 32] - Variable
        
        # EINSUM 1: Element-wise pattern (variable x variable)
        einsum_var_1 = ops.einsum([var_tensor_1, var_tensor_2], "abcd,abcd->abcd")
        
        # Source 2: Create more variable tensors from different operations
        # Use subtract to create another variable tensor
        var_tensor_3 = ops.subtract(var_tensor_1, var_tensor_2)  # [1, 10, 16, 32] - Variable
        
        # Use relu to create another variable tensor
        var_tensor_4 = ops.relu(var_tensor_2)  # [1, 10, 16, 32] - Variable
        
        # EINSUM 2: Another variable x variable pattern  
        einsum_var_2 = ops.einsum([var_tensor_3, var_tensor_4], "abcd,abcd->abcd")
        
        # Combine and use for next iteration
        combined = ops.add(einsum_var_1, einsum_var_2)
        
        # Concatenate back to [1, 10, 16, 64] for next iteration
        var_source = ops.concat([combined, combined], axis=3)  # [1, 10, 16, 64]
    
    # Final projection to output
    final_proj = ops.constant(np.random.randn(16*64, 1024).astype(np.float32), name="final_proj")
    final_reshaped = ops.reshape(var_source, [1, 10, 16*64], special_zero=False)
    final_output = ops.matmul(final_reshaped, final_proj, transpose_a=False, transpose_b=False)
    
    # Final output
    model = ov.Model([final_output], [input_tensor], name="EinsumConstantTest")
    
    # Print model statistics
    ops_by_type = {}
    for op in model.get_ops():
        op_type = op.get_type_name()
        ops_by_type[op_type] = ops_by_type.get(op_type, 0) + 1
    
    print("Original model operations:")
    for op_type, count in sorted(ops_by_type.items()):
        print(f"  {op_type}: {count}")
    
    print(f"\nEinsum breakdown:")
    print(f"  - Einsums with constants (WILL BE OPTIMIZED): 10")
    print(f"    * From direct constant: 2")
    print(f"    * From constant addition: 2") 
    print(f"    * From constant multiply: 2")
    print(f"    * From constant reshape: 2")
    print(f"    * From constant transpose: 2")
    print(f"  - Einsums without constants (WON'T BE OPTIMIZED): 20")
    print(f"    * From variadic_split operations: 10")
    print(f"    * From subtract + relu operations: 10")
    print(f"  - Total Einsums: 30")
    return model
```
You can find the original IR, the compiled IR, and the IR before and after NopElimination here:

https://drive.google.com/drive/folders/1xxNVFotGOZLeUf5ECtmJhm4fytJNoBLN?usp=sharing

---
Original graph: https://github.com/user-attachments/assets/37a93d33-4dd4-4b6b-9f83-1c21676e6551

Before NopElimination: https://github.com/user-attachments/assets/45fe58dc-b702-4510-b30a-1cc15cc43acc

After NopElimination: https://github.com/user-attachments/assets/1b7f19a6-45f8-4d60-b04d-bcd416749267

---------

Co-authored-by: Maxim Vafin <[email protected]>
Co-authored-by: Andrii Staikov <[email protected]>
Co-authored-by: Roman Kazantsev <[email protected]>

@mattdangerw mattdangerw left a comment

Thank you! And sorry for the delay here.

This is looking much better and much more maintainable for us. I still think the testing side of this is going to be a bit hard for us to maintain, so let's keep trying to simplify there (left comments on this below).

@mattdangerw mattdangerw added the kokoro:force-run (Runs Tests on GPU) label on Sep 2, 2025
@kokoro-team kokoro-team removed the kokoro:force-run (Runs Tests on GPU) label on Sep 2, 2025
@Mohamed-Ashraf273

@mattdangerw
@rkazants
Suggested changes added!

@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch 2 times, most recently from f213efc to 3277b7b on September 3, 2025 00:49
@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch from 3277b7b to 06a2a8e on September 3, 2025 00:57
@Mohamed-Ashraf273

Mohamed-Ashraf273 commented Sep 3, 2025

@mattdangerw
The failing JAX test is not caused by my PR.
Could you please review?


@mattdangerw mattdangerw left a comment

Thank you! Think this is almost ready, just a last few simplifications.

@Mohamed-Ashraf273

@mattdangerw
Suggested changes added!

@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch from 875ef0e to 3c425f2 on September 6, 2025 11:22
@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch from 3c425f2 to 9e54481 on September 6, 2025 11:49