
[OpenVINO backend] supporting inference for Gemma, Mistral and GPT2 with ov backend #2310


Open
wants to merge 10 commits into master

Conversation


@Mohamed-Ashraf273 Mohamed-Ashraf273 commented Jun 22, 2025

Description of the change

As part of my GSoC 2025 project to support inference with the OpenVINO backend for Gemma, Mistral, and GPT-2, this PR adds support for the Gemma, Mistral, and GPT-2 pipelines.

import os
os.environ["KERAS_BACKEND"] = "openvino"
import keras_hub

model = keras_hub.models.GPT2CausalLM.from_preset(
    "gpt2_large_en", dtype="float16"
)
model.summary()
output = model.generate("Keras is ", max_length=20)
print("Generated text:", output)

Reference

https://docs.openvino.ai/2025/index.html
https://keras.io/api/
https://keras.io/keras_hub/

Colab Notebook

Checklist

  • I have added all the necessary unit tests for my change.
  • I have verified that my change does not break existing code and works with all backends (TensorFlow, JAX, and PyTorch).
  • My PR is based on the latest changes of the main branch (if unsure, rebase the code).
  • I have followed the Keras Hub Model contribution guidelines in making these changes.
  • I have followed the Keras Hub API design guidelines in making these changes.
  • I have signed the Contributor License Agreement.

@github-actions github-actions bot added the Gemma Gemma model specific issues label Jun 22, 2025
@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch 26 times, most recently from 6576b03 to 074f0c2 Compare June 23, 2025 17:26
@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch 2 times, most recently from d748dd5 to f5470cd Compare June 24, 2025 13:36
@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch 9 times, most recently from 776b462 to a910089 Compare July 8, 2025 22:26
@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch 2 times, most recently from 5d24fb4 to a986b45 Compare July 9, 2025 11:03
@Mohamed-Ashraf273
Author

Hi @fchollet,
I'd appreciate any feedback on my PR.
Thanks!

@@ -48,6 +51,10 @@ jobs:
run: |
pip install -r requirements.txt --progress-bar off
pip install --no-deps -e "." --progress-bar off
if [[ "${{ matrix.backend }}" == "openvino" ]]; then
pip uninstall -y keras
pip install git+https://github.com/Mohamed-Ashraf273/keras.git@gsoc2025 --upgrade --force-reinstall --progress-bar off
Collaborator


this should be removed for final merge

import numpy as np
import openvino as ov
import openvino.runtime.opset14 as ov_opset
from keras.src.backend.openvino.core import OPENVINO_DTYPES
Collaborator


Keras does not guarantee the stability of its src layout. A future refactoring in Keras could easily break this code. The goal should be to rely only on public Keras APIs.
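If the private import has to stay for now, one mitigation (a sketch, not the PR's code; `load_openvino_dtypes` is a hypothetical helper name) is to resolve the module at runtime and fail with an actionable message if the layout has moved:

```python
import importlib


def load_openvino_dtypes():
    """Resolve OPENVINO_DTYPES from Keras internals, failing loudly.

    The module path below is private (it mirrors the import under review);
    a future Keras refactor may move or remove it.
    """
    try:
        module = importlib.import_module("keras.src.backend.openvino.core")
    except ImportError as exc:
        raise ImportError(
            "OPENVINO_DTYPES not found: the private Keras layout this code "
            "relies on has changed, or Keras is not installed."
        ) from exc
    return module.OPENVINO_DTYPES
```

This at least turns a silent breakage into a clear error; migrating to a public API remains the proper fix.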

import numpy as np
import openvino as ov
import openvino.runtime.opset14 as ov_opset
from keras.src.backend.openvino.core import OPENVINO_DTYPES
Collaborator


maybe move the nested helper functions out of make_generate_function. They can be defined at the module level (e.g., in a keras_hub.src.backend.openvino_utils file if necessary) to make the core make_generate_function logic cleaner and more readable.
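The suggested refactor might look like this (all names and signatures are illustrative, not taken from the PR): helpers hoisted to module level so the core function reads top to bottom.

```python
def _flatten_inputs(inputs):
    """Module-level helper: turn a dict of arrays into a key-sorted list."""
    return [inputs[key] for key in sorted(inputs)]


def _unflatten_outputs(keys, values):
    """Inverse of _flatten_inputs for a known set of keys."""
    return dict(zip(sorted(keys), values))


def make_generate_function(compiled_fn):
    """Core logic stays short; the details live in the helpers above."""

    def generate(inputs):
        flat = _flatten_inputs(inputs)
        outputs = compiled_fn(flat)
        return _unflatten_outputs(inputs.keys(), outputs)

    return generate
```

Hoisted helpers are also individually testable, which nested closures are not.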

@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch 2 times, most recently from afa806c to 174b78b Compare July 10, 2025 10:31
@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch from 174b78b to b41265a Compare July 10, 2025 11:24
@divyashreepathihalli
Copy link
Collaborator

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces support for the OpenVINO backend for several models (Gemma, Mistral, GPT-2). The changes are extensive and well-structured, including adding OpenVINO as a dependency, implementing a robust test-skipping mechanism, and adding OpenVINO-specific code paths to handle backend differences. The new openvino_utils.py module is a great way to encapsulate backend-specific logic.

My review focuses on improving robustness and configurability. I've identified a few issues:

  • A hardcoded "CPU" device for OpenVINO compilation, which limits hardware acceleration options.
  • A minor inefficiency in the test configuration file.
  • Potential data issues in the test exclusion lists (trailing whitespace, missing newline).

Overall, this is a great contribution. The changes are thoughtful and address the complexities of adding a new backend.

NonMaxSupressionTest::test_confidence_threshold
NonMaxSupressionTest::test_max_detections
RandomSamplerTest::test_early_stopping
RandomSamplerTest::test_stateful_call


Severity: high

This line has trailing whitespace. It could break string comparisons in the test-skipping logic, potentially preventing this test from being skipped as intended. Please remove the trailing space.

RandomSamplerTest::test_stateful_call

Comment on lines +125 to +130
openvino_skipped_tests = file.readlines()
# it is necessary to check if stripped line is not empty
# and exclude such lines
openvino_skipped_tests = [
line.strip() for line in openvino_skipped_tests if line.strip()
]


Severity: medium

This logic for reading and processing the skipped tests file can be made more concise and memory-efficient by iterating over the file object directly instead of using readlines().

Suggested change
openvino_skipped_tests = file.readlines()
# it is necessary to check if stripped line is not empty
# and exclude such lines
openvino_skipped_tests = [
line.strip() for line in openvino_skipped_tests if line.strip()
]
openvino_skipped_tests = [
line.strip() for line in file if line.strip()
]
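The suggested pattern, demonstrated on an in-memory file (the contents here are illustrative): iterating the file object directly avoids materializing every line via readlines(), and strip() also removes the trailing whitespace flagged in the earlier comment.

```python
import io

# Stand-in for the openvino_excluded_tests file.
skip_file = io.StringIO(
    "RandomSamplerTest::test_early_stopping\n"
    "\n"
    "RandomSamplerTest::test_stateful_call \n"  # note the trailing space
)

# Iterate the file object directly; keep only non-blank, stripped lines.
openvino_skipped_tests = [line.strip() for line in skip_file if line.strip()]
print(openvino_skipped_tests)
# → ['RandomSamplerTest::test_early_stopping', 'RandomSamplerTest::test_stateful_call']
```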


ov_model = ov.Model(results=results, parameters=parameters)
ov_infer._compiled_models[model_hash] = ov.compile_model(
ov_model, "CPU"


Severity: medium

The OpenVINO compilation device is hardcoded to "CPU". This prevents users from leveraging other OpenVINO-supported hardware accelerators like iGPUs or NPUs. To improve flexibility, consider making the device configurable, for example, by reading from a keras.config setting or a dedicated environment variable.

                    ov_model, keras.config.backend_device()
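One way to make the device configurable (an illustration only; OPENVINO_DEVICE is a hypothetical variable name, not an existing Keras or OpenVINO setting, and keras.config.backend_device() in the suggestion above is likewise not a current public Keras API):

```python
import os


def resolve_openvino_device(default="CPU"):
    """Pick the OpenVINO device from the environment, defaulting to CPU.

    OPENVINO_DEVICE is a hypothetical environment variable used here
    for illustration.
    """
    return os.environ.get("OPENVINO_DEVICE", default)


# The compile site would then become something like:
# compiled = ov.compile_model(ov_model, resolve_openvino_device())
```

Setting OPENVINO_DEVICE=GPU would then route compilation to an iGPU without code changes.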

keras_hub/src/samplers/greedy_sampler_test.py
keras_hub/src/samplers/top_k_sampler_test.py
keras_hub/src/samplers/top_p_sampler_test.py
keras_hub/src/utils/pipeline_model_test.py


Severity: medium

This file is missing a newline character at the end. It's a best practice to end text files with a newline to ensure they are processed correctly by various command-line tools.

keras_hub/src/utils/pipeline_model_test.py
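A quick way to verify the fix (a generic sketch, not part of the PR): check whether the file's final byte is a newline.

```python
def ends_with_newline(path):
    """Return True if the file's final byte is a newline."""
    with open(path, "rb") as f:
        data = f.read()
    return data.endswith(b"\n")
```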


@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch from c36b8e5 to 3d6a09b Compare July 15, 2025 20:44
@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch from cbd322e to e59d313 Compare July 16, 2025 11:58
@Mohamed-Ashraf273 Mohamed-Ashraf273 force-pushed the supporting_gemma_inference_with_ov_backend branch from e59d313 to 41f9a0f Compare July 16, 2025 12:01
Labels
Gemma Gemma model specific issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants