Commit 636f2b3

Merge branch 'master' into an/fix-android-jit-eltwise
2 parents fce68eb + cb5138e commit 636f2b3

64 files changed: +2316 -512 lines changed

.github/dockerfiles/docker_tag

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-pr-32459
+pr-33047

.github/dockerfiles/ov_build/debian_10_arm/Dockerfile

Lines changed: 7 additions & 1 deletion
@@ -4,10 +4,16 @@ FROM ${REGISTRY}/library/debian:10.13
 USER root
 
 # APT configuration
+# WARNING: Debian 10 "Buster" is no longer officially supported,
+# so repositories were moved to "archive.debian.org"
+# See: https://www.debian.org/releases/
 RUN echo 'Acquire::Retries "10";' > /etc/apt/apt.conf && \
     echo 'APT::Get::Assume-Yes "true";' >> /etc/apt/apt.conf && \
     echo 'APT::Get::Fix-Broken "true";' >> /etc/apt/apt.conf && \
-    echo 'APT::Get::no-install-recommends "true";' >> /etc/apt/apt.conf
+    echo 'APT::Get::no-install-recommends "true";' >> /etc/apt/apt.conf && \
+    sed -i 's/deb.debian.org/archive.debian.org/g' /etc/apt/sources.list && \
+    sed -i 's/security.debian.org/archive.debian.org/g' /etc/apt/sources.list && \
+    sed -i '/stretch-updates/d' /etc/apt/sources.list
 
 ENV DEBIAN_FRONTEND="noninteractive" \
     TZ="Europe/London"

.github/dockerfiles/ov_test/debian_10_arm/Dockerfile

Lines changed: 7 additions & 1 deletion
@@ -4,10 +4,16 @@ FROM ${REGISTRY}/library/debian:10.13
 USER root
 
 # APT configuration
+# WARNING: Debian 10 "Buster" is no longer officially supported,
+# so repositories were moved to "archive.debian.org"
+# See: https://www.debian.org/releases/
 RUN echo 'Acquire::Retries "10";' > /etc/apt/apt.conf && \
     echo 'APT::Get::Assume-Yes "true";' >> /etc/apt/apt.conf && \
     echo 'APT::Get::Fix-Broken "true";' >> /etc/apt/apt.conf && \
-    echo 'APT::Get::no-install-recommends "true";' >> /etc/apt/apt.conf
+    echo 'APT::Get::no-install-recommends "true";' >> /etc/apt/apt.conf && \
+    sed -i 's/deb.debian.org/archive.debian.org/g' /etc/apt/sources.list && \
+    sed -i 's/security.debian.org/archive.debian.org/g' /etc/apt/sources.list && \
+    sed -i '/stretch-updates/d' /etc/apt/sources.list
 
 ENV DEBIAN_FRONTEND="noninteractive" \
     TZ="Europe/London"

.github/dockerfiles/ov_test/debian_10_py310/Dockerfile

Lines changed: 7 additions & 1 deletion
@@ -4,10 +4,16 @@ FROM ${REGISTRY}/library/debian:10.13
 USER root
 
 # APT configuration
+# WARNING: Debian 10 "Buster" is no longer officially supported,
+# so repositories were moved to "archive.debian.org"
+# See: https://www.debian.org/releases/
 RUN echo 'Acquire::Retries "10";' > /etc/apt/apt.conf && \
     echo 'APT::Get::Assume-Yes "true";' >> /etc/apt/apt.conf && \
     echo 'APT::Get::Fix-Broken "true";' >> /etc/apt/apt.conf && \
-    echo 'APT::Get::no-install-recommends "true";' >> /etc/apt/apt.conf
+    echo 'APT::Get::no-install-recommends "true";' >> /etc/apt/apt.conf && \
+    sed -i 's/deb.debian.org/archive.debian.org/g' /etc/apt/sources.list && \
+    sed -i 's/security.debian.org/archive.debian.org/g' /etc/apt/sources.list && \
+    sed -i '/stretch-updates/d' /etc/apt/sources.list
 
 ENV DEBIAN_FRONTEND="noninteractive" \
     TZ="Europe/London"

docs/articles_en/about-openvino/performance-benchmarks.rst

Lines changed: 1 addition & 1 deletion
@@ -158,7 +158,7 @@ For a listing of all platforms and configurations used for testing, refer to the
 **Disclaimers**
 
 * System configurations used for Intel® Distribution of OpenVINO™ toolkit performance results
-are based on release 2025.3, as of September 3rd, 2025.
+are based on release 2025.4, as of December 1st, 2025.
 
 * OpenVINO Model Server performance results are based on release 2025.3, as of September 3rd, 2025.

docs/articles_en/about-openvino/performance-benchmarks/model-accuracy-int8-fp32.rst

Lines changed: 36 additions & 43 deletions
@@ -41,7 +41,7 @@ the table for more information.
 * - mobilenet-v2
 - ImageNet2012
 - accuracy @ top1
-- -0.93%
+- -0.91%
 - -0.93%
 - -0.91%
 - -1.03%
@@ -96,28 +96,28 @@ the table for more information.
 - 0.00%
 - 0.00%
 - 0.02%
-- 0.01%
+- 0.02%
 * - resnet-50
 - ImageNet2012
 - accuracy @ top1
 - 0.00%
 - 0.00%
 - 0.00%
-- -0.04%
+- -0.01%
 * - ssd-resnet34-1200
 - COCO2017_detection_80cl_bkgr
 - map
 - 0.02%
 - 0.02%
 - 0.02%
-- 0.06%
+- -0.23%
 * - yolo_v11
 - COCO2017_detection_80cl
 - mAP@0.5:0.05:0.95
-- 0.00%
-- 0.00%
-- 0.00%
--
+- 0.03%
+- -2.21%
+- -2.21%
+- -2.21%
 .. list-table:: Model Accuracy for AMX-FP16, AMX-INT4, Arc-FP16 and Arc-INT4 (Arc™ B-series)
 :header-rows: 1
 
@@ -134,69 +134,62 @@ the table for more information.
 - 98.1%
 - 94.4%
 - 99.5%
-- 92.6%
+- 94.0%
 * - DeepSeek-R1-Distill-Qwen-1.5B
 - Data Default WWB
 - Similarity
 - 96.5%
 - 92.4%
 - 99.7%
-- 92.1%
-* - Gemma-3-1B-it
+- 92.3%
+* - Gemma-3-4B-it
 - Data Default WWB
 - Similarity
-- 97.3%
 - 92.0%
-- 99.2%
-- 91.5%
-* - GLM4-9B-Chat
-- Data Default WWB
-- Similarity
-- 98.8%
-- 93.3%
-- %
-- 95.0%
+- 83.9%
+-
+- 84.9%
 * - Llama-2-7B-chat
 - Data Default WWB
 - Similarity
 - 99.3%
 - 93.4%
 - 99.8%
-- 91.9%
+- 93.4%
 * - Llama-3-8B
 - Data Default WWB
 - Similarity
 - 98.8%
 - 94.3%
-- %
+- 99.7%
 - 94.5%
 * - Llama-3.2-3b-instruct
 - Data Default WWB
 - Similarity
-- 98.2%
-- 93.2%
-- 98.4%
-- 94.0%
-* - Mistral-7b-instruct-V0.3
-- Data Default WWB
-- Similarity
-- 98.3%
-- 92.8%
-- 99.9%
-- 93.6%
+- 97.9%
+- 94.2%
+- 99.7%
+- 94.1%
 * - Phi4-mini-instruct
 - Data Default WWB
 - Similarity
-- 96.4%
-- 92.0%
-- 99.3%
-- 91.7%
+- 89.1%
+- 92.1%
+- 99.5%
+- 92.4%
 * - Qwen2-VL-7B
 - Data Default WWB
 - Similarity
-- 97.8%
-- 92.4%
+- 97.5%
+- 88.1%
 - 99.8%
+- 91.4%
+* - Qwen3-8B
+- Data Default WWB
+- Similarity
+- 97.8%
+- 92.3%
+-
 - 93.0%
 * - Flux.1-schnell
 - Data Default WWB
@@ -208,10 +201,10 @@ the table for more information.
 * - Stable-Diffusion-V1-5
 - Data Default WWB
 - Similarity
-- 97.3%
-- 95.1%
+- 96.3%
+- 93.3%
 - 99.5%
-- 91.5%
+- 93.7%
 
 Notes: For all accuracy metrics a "-", (minus sign), indicates an accuracy drop.
 The Similarity metric is the distance from "perfect" and as such always positive.

docs/articles_en/about-openvino/performance-benchmarks/performance-benchmarks-faq.rst

Lines changed: 1 addition & 13 deletions
@@ -55,11 +55,7 @@ Performance Information F.A.Q.
 - DeepSeek, HF
 - Auto regressive language
 - 128K
-* - `GLM4-9B-chat <https://huggingface.co/THUDM/glm-4-9b-chat/tree/main>`__
-- THUDM
-- Transformer
-- 128K
-* - `Gemma-3-1B-it <https://huggingface.co/google/gemma-3-1b-it>`__
+* - `Gemma-3-4B-it <https://huggingface.co/google/gemma-3-4b-it>`__
 - Hugginface
 - Text-To-Text Decoder-only
 - 128K
@@ -75,14 +71,6 @@ Performance Information F.A.Q.
 - Meta AI
 - Auto regressive language
 - 128K
-* - `Mistral-7b-Instruct-V0.3 <https://huggingface.co/mistralai/Mistral-7B-v0.3>`__
-- Mistral AI
-- Auto regressive language
-- 32K
-* - `Phi3-4k-mini-Instruct <https://huggingface.co/microsoft/Phi-3-mini-4k-instruct>`__
-- Huggingface
-- Auto regressive language
-- 4096
 * - `Phi4-mini-Instruct <https://huggingface.co/microsoft/Phi-4-mini-instruct>`__
 - Huggingface
 - Auto regressive language

docs/articles_en/assets/snippets/npu_remote_objects_creation.cpp

Lines changed: 12 additions & 2 deletions
@@ -44,14 +44,24 @@ int main() {
     {
         //! [wrap_nt_handle]
         void* shared_buffer = nullptr;
-        auto remote_tensor = npu_context.create_tensor(in_element_type, in_shape, shared_buffer);
+        ov::intel_npu::MemType memory_type = ov::intel_npu::MemType::SHARED_BUF;
+        auto remote_tensor = npu_context.create_tensor(in_element_type, in_shape, shared_buffer, memory_type);
         //! [wrap_nt_handle]
     }
 
+    {
+        //! [import_cpu_va]
+        void* standard_allocation = nullptr;
+        ov::intel_npu::MemType memory_type = ov::intel_npu::MemType::CPU_VA;
+        auto remote_tensor = npu_context.create_tensor(in_element_type, in_shape, standard_allocation, memory_type);
+        //! [import_cpu_va]
+    }
+
     {
         //! [wrap_dmabuf_fd]
         int32_t fd_heap = 0; // create the DMA-BUF System Heap file descriptor
-        auto remote_tensor = npu_context.create_tensor(in_element_type, in_shape, fd_heap);
+        ov::intel_npu::MemType memory_type = ov::intel_npu::MemType::SHARED_BUF;
+        auto remote_tensor = npu_context.create_tensor(in_element_type, in_shape, fd_heap, memory_type);
         //! [wrap_dmabuf_fd]
     }

docs/articles_en/documentation/compatibility-and-support/supported-operations.rst

Lines changed: 2 additions & 0 deletions
@@ -218,6 +218,7 @@ Data as of OpenVINO 2024.4, 18 Oct. 2024.
 ScatterElements
 ScatterND
 Selu
+SequenceAt
 Shape
 Shrink
 Sigmoid
@@ -231,6 +232,7 @@ Data as of OpenVINO 2024.4, 18 Oct. 2024.
 Softsign
 SpaceToDepth
 Split
+SplitToSequence
 Sqrt
 Squeeze
 STFT

docs/articles_en/openvino-workflow/running-inference/inference-devices-and-modes/npu-device/remote-tensor-api-npu-plugin.rst

Lines changed: 27 additions & 4 deletions
@@ -12,8 +12,8 @@ The NPU plugin supports memory sharing between OpenVINO and native APIs such as
 It implements the ``ov::RemoteContext`` and ``ov::RemoteTensor`` interfaces, providing mechanisms for efficient memory sharing.
 On Windows, the plugin exports an NT handle; on Linux, it uses a DMA-BUF System Heap. You can share this memory by
 passing the pointer as the ``shared_buffer`` member to the ``remote_tensor(..., shared_buffer)`` create function.
-Another option is to share memory by mapping a file into memory. These methods help avoid memory copy overhead when
-plugging OpenVINO inference into an existing NPU pipeline.
+Another option is to import memory by mapping a file into memory or by using a CPU virtual address allocation. These methods
+help avoid memory copy overhead when plugging OpenVINO inference into an existing NPU pipeline.
 
 Supported scenario by the Remote Tensor API:
 
@@ -81,8 +81,15 @@ For more details, see the code snippets below:
 :language: cpp
 :fragment: [file_mapping]
 
+.. tab-item:: Import CPU virtual address allocation
+:sync: import-cpu-va
+
+.. doxygensnippet:: docs/articles_en/assets/snippets/npu_remote_objects_creation.cpp
+:language: cpp
+:fragment: [import_cpu_va]
+
 .. tab-item:: NT handle
-:sync: nthandle
+:sync: nt-handle
 
 .. doxygensnippet:: docs/articles_en/assets/snippets/npu_remote_objects_creation.cpp
 :language: cpp
@@ -118,7 +125,23 @@ For more details, see the code snippets below:
 Limitations
 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 
-* Allocation of the NT handle or DMA-BUF System Heap file descriptor is done manually.
+The NPU plugin does not support methods for direct allocation of native handles.
+
+.. warning::
+
+**CPU Virtual Address Allocation Requirements**
+When using CPU virtual address allocations, you **must** comply with the following requirements to prevent memory corruption and crashes:
+
+**1. Memory Alignment (Mandatory)**
+Both the allocation pointer and its size must be aligned to the standard page size (4KB). Non-aligned allocations will be rejected.
+
+**2. Allocation Lifetime (Critical)**
+The allocation must remain valid **until ALL** of the following have occurred:
+* All inference requests using this remote tensor have completed execution, **AND**
+* All inference requests using this remote tensor have been destroyed, **AND**
+* The remote tensor has been destroyed
+
+Failure to maintain the allocation for the entire lifecycle will result in undefined behavior and potential crashes.
 
 Low-Level Methods for RemoteContext and RemoteTensor Creation
 #############################################################
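
Review note: the warning added to remote-tensor-api-npu-plugin.rst requires that both the pointer and the size of a CPU virtual address allocation be aligned to 4KB, but the snippet in npu_remote_objects_creation.cpp only shows a `nullptr` placeholder. The following is a minimal sketch of one way such a buffer could be obtained. It is not part of this commit: the names `kPageSize`, `round_up_to_page`, `is_page_aligned`, `PageAlignedBuffer` and `allocate_page_aligned` are hypothetical, and the 4096-byte page size is taken from the warning text rather than queried from the system.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#if defined(_WIN32)
#include <malloc.h>  // _aligned_malloc / _aligned_free
#endif

// Assumption: 4 KiB pages, the figure quoted in the documentation warning.
constexpr std::size_t kPageSize = 4096;

// Round a byte count up to the next multiple of the page size.
// (Overflow for sizes close to SIZE_MAX is not handled in this sketch.)
constexpr std::size_t round_up_to_page(std::size_t bytes) {
    return (bytes + kPageSize - 1) / kPageSize * kPageSize;
}

// True when both the address and the size are multiples of the page size.
inline bool is_page_aligned(const void* ptr, std::size_t size) {
    return reinterpret_cast<std::uintptr_t>(ptr) % kPageSize == 0 &&
           size % kPageSize == 0;
}

// Deleter matching the platform-specific aligned allocator used below.
struct PageAlignedDeleter {
    void operator()(void* p) const noexcept {
#if defined(_WIN32)
        _aligned_free(p);
#else
        std::free(p);
#endif
    }
};

// Hypothetical owning handle: releases the buffer when it goes out of scope.
using PageAlignedBuffer = std::unique_ptr<void, PageAlignedDeleter>;

// Allocate a buffer whose address and size are both page aligned.
// Returns an empty handle if the request is zero bytes or allocation fails.
inline PageAlignedBuffer allocate_page_aligned(std::size_t bytes) {
    const std::size_t size = round_up_to_page(bytes);
    if (size == 0) {
        return PageAlignedBuffer{};
    }
#if defined(_WIN32)
    return PageAlignedBuffer{_aligned_malloc(size, kPageSize)};
#else
    // std::aligned_alloc (C++17) requires size to be a multiple of alignment,
    // which the rounding above guarantees.
    return PageAlignedBuffer{std::aligned_alloc(kPageSize, size)};
#endif
}
```

Under these assumptions, `buffer.get()` would take the place of `standard_allocation` in the `import_cpu_va` fragment, with the tensor's byte size rounded up by `round_up_to_page`. Because C++ destroys local objects in reverse order of declaration, declaring the buffer before the remote tensor and the inference requests keeps the allocation alive until they are gone, which is the lifetime ordering the warning asks for.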
