[OpenVINO backend] supporting inference for Gemma, Mistral and GPT2 with ov backend #2310
Hi @fchollet,
@@ -48,6 +51,10 @@ jobs:
run: |
pip install -r requirements.txt --progress-bar off
pip install --no-deps -e "." --progress-bar off
if [[ "${{ matrix.backend }}" == "openvino" ]]; then
pip uninstall -y keras
pip install git+https://github.com/Mohamed-Ashraf273/keras.git@gsoc2025 --upgrade --force-reinstall --progress-bar off
This should be removed for the final merge.
keras_hub/src/models/causal_lm.py
Outdated
import numpy as np
import openvino as ov
import openvino.runtime.opset14 as ov_opset
from keras.src.backend.openvino.core import OPENVINO_DTYPES
Keras does not guarantee the stability of its src layout. A future refactoring in Keras could easily break this code. The goal should be to rely only on public Keras APIs.
keras_hub/src/models/causal_lm.py
Outdated
import numpy as np
import openvino as ov
import openvino.runtime.opset14 as ov_opset
from keras.src.backend.openvino.core import OPENVINO_DTYPES
Consider moving the nested helper functions out of make_generate_function. They can be defined at the module level (e.g., in a keras_hub.src.backend.openvino_utils file if necessary) to make the core make_generate_function logic cleaner and more readable.
/gemini review
Code Review
This pull request introduces support for the OpenVINO backend for several models (Gemma, Mistral, GPT-2). The changes are extensive and well-structured, including adding OpenVINO as a dependency, implementing a robust test-skipping mechanism, and adding OpenVINO-specific code paths to handle backend differences. The new openvino_utils.py module is a great way to encapsulate backend-specific logic.
My review focuses on improving robustness and configurability. I've identified a few issues:
- A hardcoded "CPU" device for OpenVINO compilation, which limits hardware acceleration options.
- A minor inefficiency in the test configuration file.
- Potential data issues in the test exclusion lists (trailing whitespace, missing newline).
Overall, this is a great contribution. The changes are thoughtful and address the complexities of adding a new backend.
NonMaxSupressionTest::test_confidence_threshold
NonMaxSupressionTest::test_max_detections
RandomSamplerTest::test_early_stopping
RandomSamplerTest::test_stateful_call
openvino_skipped_tests = file.readlines()
# it is necessary to check if stripped line is not empty
# and exclude such lines
openvino_skipped_tests = [
line.strip() for line in openvino_skipped_tests if line.strip()
]
This logic for reading and processing the skipped tests file can be made more concise and memory-efficient by iterating over the file object directly instead of using readlines().
- openvino_skipped_tests = file.readlines()
- # it is necessary to check if stripped line is not empty
- # and exclude such lines
- openvino_skipped_tests = [
- line.strip() for line in openvino_skipped_tests if line.strip()
- ]
+ openvino_skipped_tests = [
+ line.strip() for line in file if line.strip()
+ ]
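As a rough illustration of the suggested change, the filtering can be wrapped in a small helper that accepts any iterable of lines, including an open file object. The function name `parse_skipped_tests` is hypothetical and not part of this PR.

```python
def parse_skipped_tests(lines):
    """Return stripped, non-empty entries from an iterable of lines.

    `lines` may be an open text file (iterated lazily, line by line)
    or any other iterable of strings.
    """
    # strip() removes trailing whitespace and newlines; blank lines are
    # dropped because their stripped form is the empty (falsy) string.
    return [line.strip() for line in lines if line.strip()]
```

Used as `with open(path) as file: skipped = parse_skipped_tests(file)`, this tolerates trailing whitespace on an entry and a missing newline at the end of the file, since every entry is stripped before it is stored.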
ov_model = ov.Model(results=results, parameters=parameters)
ov_infer._compiled_models[model_hash] = ov.compile_model(
ov_model, "CPU"
The OpenVINO compilation device is hardcoded to "CPU". This prevents users from leveraging other OpenVINO-supported hardware accelerators like iGPUs or NPUs. To improve flexibility, consider making the device configurable, for example, by reading from a keras.config setting or a dedicated environment variable.
ov_model, keras.config.backend_device()
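A minimal sketch of the environment-variable option is shown below. The variable name `KERAS_HUB_OPENVINO_DEVICE` and both helper functions are assumptions made for illustration; neither Keras nor OpenVINO defines them. Including the device in the cache key is likewise only a suggestion, so that a model compiled for one device is not reused for another.

```python
import os

# Hypothetical environment variable name; not defined by Keras or OpenVINO.
DEVICE_ENV_VAR = "KERAS_HUB_OPENVINO_DEVICE"
DEFAULT_DEVICE = "CPU"


def resolve_ov_device(environ=None):
    """Return the device name to compile for, falling back to "CPU"."""
    environ = os.environ if environ is None else environ
    device = environ.get(DEVICE_ENV_VAR, "").strip()
    return device if device else DEFAULT_DEVICE


def compiled_model_key(model_hash, device):
    """Build a cache key that distinguishes compilations per device."""
    return (model_hash, device)


# Intended use inside the existing code path (not executed here):
#   device = resolve_ov_device()
#   key = compiled_model_key(model_hash, device)
#   ov_infer._compiled_models[key] = ov.compile_model(ov_model, device)
```

With the variable unset, behavior matches the current hardcoded "CPU" path, so existing users would see no change.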
openvino_excluded_tests.txt
Outdated
keras_hub/src/samplers/greedy_sampler_test.py
keras_hub/src/samplers/top_k_sampler_test.py
keras_hub/src/samplers/top_p_sampler_test.py
keras_hub/src/utils/pipeline_model_test.py
Description of the change
As a part of my GSoC25 project to support inference with the openvino backend for Gemma, Mistral and GPT-2, this is my PR for supporting Gemma, Mistral and GPT-2 pipelines.
Reference
https://docs.openvino.ai/2025/index.html
https://keras.io/api/
https://keras.io/keras_hub/
Colab Notebook
Checklist