Update dependency com.microsoft.onnxruntime:onnxruntime to v1.19.2 (#46)
This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [com.microsoft.onnxruntime:onnxruntime](https://microsoft.github.io/onnxruntime/) ([source](https://redirect.github.com/microsoft/onnxruntime)) | `1.17.1` -> `1.19.2` | [![age](https://developer.mend.io/api/mc/badges/age/maven/com.microsoft.onnxruntime:onnxruntime/1.19.2?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![adoption](https://developer.mend.io/api/mc/badges/adoption/maven/com.microsoft.onnxruntime:onnxruntime/1.19.2?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![passing](https://developer.mend.io/api/mc/badges/compatibility/maven/com.microsoft.onnxruntime:onnxruntime/1.17.1/1.19.2?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![confidence](https://developer.mend.io/api/mc/badges/confidence/maven/com.microsoft.onnxruntime:onnxruntime/1.17.1/1.19.2?slim=true)](https://docs.renovatebot.com/merge-confidence/) |

---

### Release Notes

<details>
<summary>microsoft/onnxruntime
(com.microsoft.onnxruntime:onnxruntime)</summary>

###
[`v1.19.2`](https://redirect.github.com/microsoft/onnxruntime/releases/tag/v1.19.2):
ONNX Runtime v1.19.2

#### Announcements

- ORT 1.19.2 is a small patch release that fixes several broken workflows and includes a number of bug fixes.

#### Build System & Packages

-   Fixed the signing of native DLLs.
- Disabled absl symbolize in Windows Release build to avoid dependency
on dbghelp.dll.

#### Training

- Restored support for CUDA compute capability 7.0 and 7.5 with CUDA 12,
and 6.0 and 6.1 with CUDA 11.
-   Several fixes for training CI pipelines.

#### Mobile

- Fixed ArgMaxOpBuilder::AddToModelBuilderImpl() nullptr Node access for
CoreML EP.

#### Generative AI

-   Added CUDA kernel for Phi3 MoE.
- Added smooth softmax support in CUDA and CPU kernels for the
GroupQueryAttention operator.
- Fixed number of splits calculations in GroupQueryAttention CUDA
operator.
-   Enabled causal support in the MultiHeadAttention CUDA operator.

#### Contributors

[@&#8203;prathikr](https://redirect.github.com/prathikr),
[@&#8203;mszhanyi](https://redirect.github.com/mszhanyi),
[@&#8203;edgchen1](https://redirect.github.com/edgchen1),
[@&#8203;tianleiwu](https://redirect.github.com/tianleiwu),
[@&#8203;wangyems](https://redirect.github.com/wangyems),
[@&#8203;aciddelgado](https://redirect.github.com/aciddelgado),
[@&#8203;mindest](https://redirect.github.com/mindest),
[@&#8203;snnn](https://redirect.github.com/snnn),
[@&#8203;baijumeswani](https://redirect.github.com/baijumeswani),
[@&#8203;MaanavD](https://redirect.github.com/MaanavD)

**Thanks to everyone who helped ship this release smoothly!**

**Full Changelog**:
microsoft/onnxruntime@v1.19.0...v1.19.2

###
[`v1.19.0`](https://redirect.github.com/microsoft/onnxruntime/releases/tag/v1.19.0):
ONNX Runtime v1.19.0

#### Announcements

- Note that the wrong commit was initially tagged with v1.19.0. The final commit has since been correctly tagged: microsoft/onnxruntime@26250ae. This shouldn't affect much, but sorry for the inconvenience!

#### Build System & Packages

-   Added support for NumPy 2.x
-   Upgraded the Qualcomm SDK to 2.25
-   Upgraded ONNX from 1.16 → 1.16.1
-   Default GPU packages now use CUDA 12.x and cuDNN 9.x (previously CUDA 11.x/cuDNN 8.x); CUDA 11.x/cuDNN 8.x packages have been moved to the aiinfra VS feed
-   Added TensorRT 10.2 support
-   Introduced Java CUDA 12 packages on Maven
-   Discontinued support for Xamarin (Xamarin reached EOL on May 1, 2024)
-   Discontinued support for macOS 11 and raised the minimum supported macOS version to 12 (macOS 11 reached EOL in September 2023)
-   Discontinued support for iOS 12 and raised the minimum supported iOS version to 13

#### Core

-   Implemented DeformConv
- [Fixed big-endian support](https://redirect.github.com/microsoft/onnxruntime/pull/21133) and added build support for AIX

#### Performance

- Added QDQ support for INT4 quantization in CPU and CUDA Execution
Providers
- Implemented FlashAttention on CPU to improve performance for GenAI
prompt cases
-   Improved INT4 performance on CPU (X64, ARM64) and NVIDIA GPUs

#### Execution Providers

-   TensorRT
    -   Updated to support TensorRT 10.2
    -   Removed calls to deprecated APIs
    -   Enabled refittable embedded engines when the ONNX model is provided as a byte stream

-   CUDA
- Upgraded cutlass to 3.5.0 for performance improvement of memory
efficient attention.
- Updated MultiHeadAttention and Attention operators to be thread-safe.
- Added sdpa_kernel provider option to choose kernel for Scaled
Dot-Product Attention.
    -   Expanded op support - Tile (bf16)

-   CPU
- Expanded op support - GroupQueryAttention, SparseAttention (for Phi-3
small)

-   QNN
    -   Updated to support QNN SDK 2.25
- Expanded op support - HardSigmoid, ConvTranspose 3d, Clip (int32
data), Matmul (int4 weights), Conv (int4 weights), prelu (fp16)
    -   Expanded fusion support – Conv + Clip/Relu fusion

-   OpenVINO
    -   Added support for OpenVINO 2024.3
    -   Support for enabling EpContext using session options

-   DirectML
    -   Updated DirectML from 1.14.1 → 1.15.1
    -   Updated ONNX opset from 17 → 20
    -   Opset 19 and Opset 20 are supported with known caveats:
        -   Gridsample 20: 5d not supported
        -   DeformConv not supported

#### Mobile

-   Additional CoreML ML Program operators were added
- See supported operators list
[here](https://redirect.github.com/microsoft/onnxruntime/blob/main/tools/ci_build/github/apple/coreml_supported_mlprogram_ops.md)
-   Fixed packaging issue with macOS framework in onnxruntime-c cocoapod
-   Removed Xamarin support
    -   Xamarin EOL was May 1, 2024
- [Xamarin official support policy | .NET
(microsoft.com)](https://dotnet.microsoft.com/en-us/platform/support/policy/xamarin)

#### Web

- Updated JavaScript packaging to align with best practices; this may introduce slight incompatibilities when apps bundle onnxruntime-web
-   Improved CPU operator coverage for WebNN (now supported by Chrome)

#### Training

-   No specific updates

#### GenAI

-   Added support for new models: Qwen, Llama 3.1, Gemma 2, and Phi-3 small
-   Added support for building quantized models with the AWQ and GPTQ methods
-   Performance improvements for Intel and Arm CPUs
-   Packaging and language bindings
    -   Added Java bindings (build from source)
    -   Separated OnnxRuntime.dll and directml.dll out of the GenAI package to improve usability
    -   Published packages for Windows Arm
    -   Added support for Android (build from source)
-   Bug fixes, such as the [long-prompt correctness issue](https://redirect.github.com/microsoft/onnxruntime-genai/issues/552) for Phi-3.

#### Extensions

- Added C APIs for language, vision and audio processors including new
FeatureExtractor for Whisper
- Support for Phi-3 Small Tokenizer and new OpenAI tiktoken format for
fast loading of BPE tokenizers
- Added new CUDA custom operators such as MulSigmoid, Transpose2DCast,
ReplaceZero, AddSharedInput and MulSharedInput
-   Enhanced Custom Op Lite API on GPU and fused kernels for DORT
- Bug fixes, including null bos_token for Qwen2 tokenizer and
SentencePiece converted FastTokenizer issue on non-ASCII characters, as
well as necessary updates for MSVC 19.40 and numpy 2.0 release

#### Contributors

Changming Sun, Baiju Meswani, Scott McKay, Edward Chen, Jian Chen,
Wanming Lin, Tianlei Wu, Adrian Lizarraga, Chester Liu, Yi Zhang, Yulong
Wang, Hector Li, kunal-vaishnavi, pengwa, aciddelgado, Yifan Li, Xu
Xing, Yufeng Li, Patrice Vignola, Yueqing Zhang, Jing Fang, Chi Lo,
Dmitri Smirnov, mingyueliuh, cloudhan, Yi-Hong Lyu, Ye Wang, Ted
Themistokleous, Guenther Schmuelling, George Wu, mindest, liqun Fu,
Preetha Veeramalai, Justin Chu, Xiang Zhang, zz002, vraspar, kailums,
guyang3532, Satya Kumar Jandhyala, Rachel Guo, Prathik Rao, Maximilian
Müller, Sophie Schoenmeyer, zhijiang, maggie1059, ivberg, glen-amd,
aamajumder, Xavier Dupré, Vincent Wang, Suryaprakash Shanmugam, Sheil
Kumar, Ranjit Ranjan, Peishen Yan, Frank Dong, Chen Feiyue, Caroline
Zhu, Adam Louly, Ștefan Talpalaru, zkep, winskuo-quic, wejoncy,
vividsnow, vivianw-amd, moyo1997, mcollinswisc, jingyanwangms, Yang Gu,
Tom McDonald, Sunghoon, Shubham Bhokare, RuomeiMS, Qingnan Duan,
PeixuanZuo, Pavan Goyal, Nikolai Svakhin, KnightYao, Jon Campbell, Johan
MEJIA, Jake Mathern, Hans, Hann Wang, Enrico Galli, Dwayne Robinson,
Clément Péron, Chip Kerchner, Chen Fu, Carson M, Adam Reeve, Adam
Pocock.

**Big thank you to everyone who contributed to this release!**

**Full Changelog**:
microsoft/onnxruntime@v1.18.1...v1.19.0

###
[`v1.18.0`](https://redirect.github.com/microsoft/onnxruntime/releases/tag/v1.18.0):
ONNX Runtime v1.18.0

#### Announcements

-   **Windows ARM32 support has been dropped at the source code level**.
- **Python version >=3.8 is now required for build.bat/build.sh**
(previously >=3.7). *Note: If you have Python version <3.8, you can
bypass the tools and use CMake directly.*
- **The
[onnxruntime-mobile](https://mvnrepository.com/artifact/com.microsoft.onnxruntime/onnxruntime-mobile)
Android package and onnxruntime-mobile-c/onnxruntime-mobile-objc iOS
cocoapods are being deprecated**. Please use the
[onnxruntime-android](https://mvnrepository.com/artifact/com.microsoft.onnxruntime/onnxruntime-android)
Android package, and onnxruntime-c/onnxruntime-objc cocoapods, which
support ONNX and ORT format models and all operators and data types.
*Note: If you require a smaller binary size, a custom build is required.
See details on creating a custom Android or iOS package on [Custom build
|
onnxruntime](https://onnxruntime.ai/docs/build/custom.html#custom-build-packages).*

#### Build System & Packages

-   CoreML execution provider now depends on coremltools.
-   Flatbuffers has been upgraded from 1.12.0 → 23.5.26.
-   ONNX has been upgraded from 1.15 → 1.16.
-   EMSDK has been upgraded from 3.1.51 → 3.1.57.
- Intel neural_speed library has been upgraded from v0.1.1 → v0.3 with
several important bug fixes.
- There is a new onnxruntime_CUDA_MINIMAL CMake option for building ONNX
Runtime CUDA execution provider without any operations apart from memcpy
ops.
-   Added Mac Catalyst build support for macOS.
- Added initial support for RISC-V and three new build options for
it: `--rv64`, `--riscv_toolchain_root`, and `--riscv_qemu_path`.
- Now you can build TensorRT EP with protobuf-lite instead of the full
version of protobuf.
- Some security-related compile/link flags have been moved from the
default setting → new build
option: `--use_binskim_compliant_compile_flags`. *Note: All our release
binaries are built with this flag, but when building ONNX Runtime from
source, this flag is default OFF.*
-   Windows ARM64 build now depends on PyTorch CPUINFO library.
- Windows OneCore build now uses “Reverse forwarding” apisets instead of
“Direct forwarding”, so onnxruntime.dll in our Nuget packages will
depend on kernel32.dll. *Note: Windows systems without kernel32.dll need
to have reverse forwarders (see [API set loader operation - Win32 apps |
Microsoft
Learn](https://learn.microsoft.com/en-us/windows/win32/apiindex/api-set-loader-operation)
for more information).*

#### Core

-   Added ONNX 1.16 support.
-   Added additional optimizations related to Dynamo-exported models.
- Improved testing infrastructure for EPs developed as shared libraries.
- Exposed Reserve() in OrtAllocator to allow custom allocators to work
when session.use_device_allocator_for_initializers is specified.
-   Reduced lock contention caused by memory allocations.
- Improved session creation time (graph and graph transformer
optimizations).
- Added new SessionOptions config entry to disable specific transformers
and rules.
- \[C# API] Exposed SessionOptions.DisablePerSessionThreads to allow
sharing of threadpool between sessions.
-   \[Java API] Added CUDA 12 Java support.
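
The CUDA 12 Java support noted above can be exercised with a short sketch. This is a hypothetical usage example, not code from this PR: the model path `model.onnx` and device index 0 are placeholders, and the CUDA 12 variant of the native runtime must be on the classpath for `addCUDA` to succeed.

```java
import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtException;
import ai.onnxruntime.OrtSession;

public final class CudaSessionSketch {
    public static void main(String[] args) throws OrtException {
        OrtEnvironment env = OrtEnvironment.getEnvironment();
        try (OrtSession.SessionOptions opts = new OrtSession.SessionOptions()) {
            // Request the CUDA execution provider on device 0 (placeholder
            // index); throws an OrtException if the CUDA provider is absent.
            opts.addCUDA(0);
            // "model.onnx" is a placeholder path, not a file from this repo.
            try (OrtSession session = env.createSession("model.onnx", opts)) {
                System.out.println("Model inputs: " + session.getInputNames());
            }
        }
    }
}
```

If no CUDA-capable device or CUDA 12 runtime is available, the same options object can simply omit `addCUDA` and the session falls back to the CPU provider.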

#### Performance

-   Improved 4bit quant support:
    -   Added HQQ quantization support to improve accuracy.
- Implemented general GEMM kernel and improved GEMV kernel performance
on GPU.
    -   Improved GEMM kernel quality and performance on x64.
- Implemented general GEMM kernel and improved GEMV performance on
ARM64.
-   Improved MultiheadAttention performance on CPU.

#### Execution Providers

-   TensorRT
    -   Added support for TensorRT 10.
    -   Finalized support for DDS ops.
    -   Added Python support for user provided CUDA stream.
    -   Fixed various bugs.

-   CUDA
    -   Added support of multiple CUDA graphs.
    -   Added a provider option to disable TF32.
    -   Added Python support for user provided CUDA stream.
    -   Extended MoE to support Tensor Parallelism and int4 quantization.
    -   Fixed bugs in the BatchNorm and TopK kernels.

-   QNN
    -   Added support for up to QNN SDK 2.22.
- Upgraded support from A16W8 → mixed 8/16-bit precision configurability
per layer.
    -   Added fp16 execution support via enable_htp_fp16 option.
    -   Added multiple partition support for QNN context binary.
    -   Expanded operator support and fixed various bugs.
    -   Added support for per-channel quantized weights for Conv.
    -   Integration with Qualcomm’s AIHub.

-   OpenVINO
    -   Added support for up to OpenVINO 2024.1.
    -   Added support for importing pre-compiled blob as EPContext blob.
- Separated device and precision as inputs by removing support for
device_id in provider options and adding precision as separate CLI
option.
- Deprecated CPU_FP32 and GPU_FP32 terminology and introduced CPU and
GPU terminology.
    -   `AUTO:GPU,CPU` will only create GPU blob, not CPU blob.

-   DirectML
- Additional ONNX operator support: Resize-18 and Resize-19, Col2Im-18, IsNaN-20, IsInf-20, and ReduceMax-20.
- Additional contrib op support: SimplifiedLayerNormalization,
SkipSimplifiedLayerNormalization, QLinearAveragePool,
MatMulIntegerToFloat, GroupQueryAttention, DynamicQuantizeMatMul, and
QAttention.

#### Mobile

-   Improved performance of ARM64 4-bit quantization.
-   Added support for building with QNN on Android.
-   Added MacCatalyst support.
-   Added visionOS support.
-   Added initial support for creating ML Program format CoreML models.
-   Added support for 1D Conv and ConvTranspose to XNNPACK EP.

#### Web

-   Added WebNN EP preview.
-   Improved WebGPU performance (MHA, ROE).
-   Added more WebGPU and WebNN examples.
-   Increased generative model support.
-   Optimized Buffer management to reduce memory footprint.

#### Training

-   Large Model Training
    -   Added optimizations for Dynamo-exported models.
    -   Added Mixtral integration using ORT backend.
-   On-Device Training
- Added support for models >2GB to enable SLM training on edge devices.

#### GenAI

-   Added additional model support: Phi-3, Gemma, LLama-3.
-   Added DML EP support.
-   Improved tokenizer quality.
-   Improved sampling method and ORT model performance.

#### Extensions

-   Created Java packaging pipeline and published to Maven repository.
- Added support for conversion of Huggingface FastTokenizer into ONNX
custom operator.
- Unified the SentencePiece tokenizer with other Byte Pair Encoding
(BPE) based tokenizers.
-   Fixed Whisper large model pre-processing bug.
- Enabled eager execution for custom operator and refactored the header
file structure.

#### Contributors

Yi Zhang, Yulong Wang, Adrian Lizarraga, Changming Sun, Scott McKay,
Tianlei Wu, Peng Wang, Hector Li, Edward Chen, Dmitri Smirnov, Patrice
Vignola, Guenther Schmuelling, Ye Wang, Chi Lo, Wanming Lin, Xu Xing,
Baiju Meswani, Peixuan Zuo, Vincent Wang, Markus Tavenrath, Lei Cao,
Kunal Vaishnavi, Rachel Guo, Satya Kumar Jandhyala, Sheil Kumar, Yifan
Li, Jiajia Qin, Maximilian Müller, Xavier Dupré, Yi-Hong Lyu, Yufeng Li,
Alejandro Cid Delgado, Adam Louly, Prathik Rao, wejoncy, Zesong Wang,
Adam Pocock, George Wu, Jian Chen, Justin Chu, Xiaoyu, guyang3532,
Jingyan Wang, raoanag, Satya Jandhyala, Hariharan Seshadri, Jiajie Hu,
Sumit Agarwal, Peter Mcaughan, Zhijiang Xu, Abhishek Jindal, Jake
Mathern, Jeff Bloomfield, Jeff Daily, Linnea May, Phoebe Chen, Preetha
Veeramalai, Shubham Bhokare, Wei-Sheng Chin, Yang Gu, Yueqing Zhang,
Guangyun Han, inisis, ironman, Ivan Berg, Liqun Fu, Yu Luo, Rui Ren,
Sahar Fatima, snadampal, wangshuai09, Zhenze Wang, Andrew Fantino,
Andrew Grigorev, Ashwini Khade, Atanas Dimitrov, AtomicVar, Belem Zhang,
Bowen Bao, Chen Fu, Dhruv Matani, Fangrui Song, Francesco, Frank Dong,
Hans Chen, He Li, Heflin Stephen Raj, Jambay Kinley, Masayoshi Tsutsui,
Matttttt, Nanashi, Phoebe Chen, Pranav Sharma, Segev Finer, Sophie
Schoenmeyer, TP Boudreau, Ted Themistokleous, Thomas Boby, Xiang Zhang,
Yongxin Wang, Zhang Lei, aamajumder, danyue, Duansheng Liu, enximi,
fxmarty, kailums, maggie1059, mindest, mo-ja, moyo1997
**Big thank you to everyone who contributed to this release!**

###
[`v1.17.3`](https://redirect.github.com/microsoft/onnxruntime/releases/tag/v1.17.3):
ONNX Runtime v1.17.3

### What's new?

**General:**

- Update copying API header files to make Linux logic consistent with
Windows
([#&#8203;19736](https://redirect.github.com/microsoft/onnxruntime/pull/19736))
- [@&#8203;mszhanyi](https://redirect.github.com/mszhanyi)
- Pin ONNX version to fix DML and Python packaging pipeline exceptions
([#&#8203;20073](https://redirect.github.com/microsoft/onnxruntime/pull/20073))
- [@&#8203;mszhanyi](https://redirect.github.com/mszhanyi)

**Build System & Packages:**

- Fix minimal build with training APIs enabled bug affecting Apple
framework
([#&#8203;19858](https://redirect.github.com/microsoft/onnxruntime/pull/19858))
- [@&#8203;edgchen1](https://redirect.github.com/edgchen1)

**Core:**

- Fix SplitToSequence op with string tensor bug
([#&#8203;19942](https://redirect.github.com/microsoft/onnxruntime/pull/19942))
- [@&#8203;Craigacp](https://redirect.github.com/Craigacp)

**CUDA EP:**

- Fix onnxruntime_test_all build break with CUDA
([#&#8203;19673](https://redirect.github.com/microsoft/onnxruntime/pull/19673))
- [@&#8203;gedoensmax](https://redirect.github.com/gedoensmax)
- Fix broken pooling CUDA NHWC ops and ensure NCHW / NHWC parity
([#&#8203;19889](https://redirect.github.com/microsoft/onnxruntime/pull/19889))
- [@&#8203;mtavenrath](https://redirect.github.com/mtavenrath)

**TensorRT EP:**

- Fix TensorRT build break caused by image update
([#&#8203;19880](https://redirect.github.com/microsoft/onnxruntime/pull/19880))
- [@&#8203;jywu-msft](https://redirect.github.com/jywu-msft)
- Fix TensorRT custom op list concurrency bug
([#&#8203;20093](https://redirect.github.com/microsoft/onnxruntime/pull/20093))
- [@&#8203;chilo-ms](https://redirect.github.com/chilo-ms)

**Web:**

- Add hardSigmoid op support and hardSigmoid activation for fusedConv
([#&#8203;19215](https://redirect.github.com/microsoft/onnxruntime/pull/19215),
[#&#8203;19233](https://redirect.github.com/microsoft/onnxruntime/pull/19233))
- [@&#8203;qjia7](https://redirect.github.com/qjia7)
- Add support for WebNN async API with Asyncify
([#&#8203;19415](https://redirect.github.com/microsoft/onnxruntime/pull/19145))
- [@&#8203;Honry](https://redirect.github.com/Honry)
- Add uniform support for conv, conv transpose, conv grouped, and fp16
([#&#8203;18753](https://redirect.github.com/microsoft/onnxruntime/pull/18753),
[#&#8203;19098](https://redirect.github.com/microsoft/onnxruntime/pull/19098))
- [@&#8203;axinging](https://redirect.github.com/axinging)
- Add capture and replay support for JS EP
([#&#8203;18989](https://redirect.github.com/microsoft/onnxruntime/pull/18989))
- [@&#8203;fs-eire](https://redirect.github.com/fs-eire)
- Add LeakyRelu activation for fusedConv
([#&#8203;19369](https://redirect.github.com/microsoft/onnxruntime/pull/19369))
- [@&#8203;qjia7](https://redirect.github.com/qjia7)
- Add FastGelu custom op support
([#&#8203;19392](https://redirect.github.com/microsoft/onnxruntime/pull/19369))
- [@&#8203;fs-eire](https://redirect.github.com/fs-eire)
- Allow uint8 tensors for WebGPU
([#&#8203;19545](https://redirect.github.com/microsoft/onnxruntime/pull/19545))
- [@&#8203;satyajandhyala](https://redirect.github.com/satyajandhyala)
- Add and optimize MatMulNBits
([#&#8203;19852](https://redirect.github.com/microsoft/onnxruntime/pull/19852))
- [@&#8203;satyajandhyala](https://redirect.github.com/satyajandhyala)
- Enable ort-web with any Float16Array polyfill
([#&#8203;19305](https://redirect.github.com/microsoft/onnxruntime/pull/19305))
- [@&#8203;fs-eire](https://redirect.github.com/fs-eire)
- Allow multiple EPs to be specified in backend resolve logic
([#&#8203;19735](https://redirect.github.com/microsoft/onnxruntime/pull/19735))
- [@&#8203;fs-eire](https://redirect.github.com/fs-eire)
- Various bug fixes:
([#&#8203;19258](https://redirect.github.com/microsoft/onnxruntime/pull/19258))
- [@&#8203;gyagp](https://redirect.github.com/gyagp),
([#&#8203;19201](https://redirect.github.com/microsoft/onnxruntime/pull/19201),
[#&#8203;19554](https://redirect.github.com/microsoft/onnxruntime/pull/19554))
- [@&#8203;hujiajie](https://redirect.github.com/hujiajie),
([#&#8203;19262](https://redirect.github.com/microsoft/onnxruntime/pull/19262),
[#&#8203;19981](https://redirect.github.com/microsoft/onnxruntime/pull/19981))
- [@&#8203;guschmue](https://redirect.github.com/guschmue),
([#&#8203;19581](https://redirect.github.com/microsoft/onnxruntime/pull/19581),
[#&#8203;19596](https://redirect.github.com/microsoft/onnxruntime/pull/19596),
[#&#8203;19387](https://redirect.github.com/microsoft/onnxruntime/pull/19387))
- [@&#8203;axinging](https://redirect.github.com/axinging),
([#&#8203;19613](https://redirect.github.com/microsoft/onnxruntime/pull/19613))
- [@&#8203;satyajandhyala](https://redirect.github.com/satyajandhyala)
- Various improvements for performance and usability:
([#&#8203;19202](https://redirect.github.com/microsoft/onnxruntime/pull/19202))
- [@&#8203;qjia7](https://redirect.github.com/qjia7),
([#&#8203;18900](https://redirect.github.com/microsoft/onnxruntime/pull/18900),
[#&#8203;19281](https://redirect.github.com/microsoft/onnxruntime/pull/19281),
[#&#8203;18883](https://redirect.github.com/microsoft/onnxruntime/pull/18883))
- [@&#8203;axinging](https://redirect.github.com/axinging),
([#&#8203;18788](https://redirect.github.com/microsoft/onnxruntime/pull/18788),
[#&#8203;19737](https://redirect.github.com/microsoft/onnxruntime/pull/19737))
- [@&#8203;satyajandhyala](https://redirect.github.com/satyajandhyala),
([#&#8203;19610](https://redirect.github.com/microsoft/onnxruntime/pull/19610))
- [@&#8203;segevfiner](https://redirect.github.com/segevfiner),
([#&#8203;19614](https://redirect.github.com/microsoft/onnxruntime/pull/19614),
[#&#8203;19702](https://redirect.github.com/microsoft/onnxruntime/pull/19702),
[#&#8203;19677](https://redirect.github.com/microsoft/onnxruntime/pull/19677),
[#&#8203;19857](https://redirect.github.com/microsoft/onnxruntime/pull/19857),
[#&#8203;19940](https://redirect.github.com/microsoft/onnxruntime/pull/19940))
- [@&#8203;fs-eire](https://redirect.github.com/fs-eire),
([#&#8203;19791](https://redirect.github.com/microsoft/onnxruntime/pull/19791))
- [@&#8203;gyagp](https://redirect.github.com/gyagp),
([#&#8203;19868](https://redirect.github.com/microsoft/onnxruntime/pull/19868))
- [@&#8203;guschmue](https://redirect.github.com/guschmue),
([#&#8203;19433](https://redirect.github.com/microsoft/onnxruntime/pull/19433))
- [@&#8203;martholomew](https://redirect.github.com/martholomew),
([#&#8203;19932](https://redirect.github.com/microsoft/onnxruntime/pull/19932))
- [@&#8203;ibelem](https://redirect.github.com/ibelem)

**Windows:**

- Fix Windows memory mapping bug affecting some larger models
([#&#8203;19623](https://redirect.github.com/microsoft/onnxruntime/pull/19623))
- [@&#8203;yufenglee](https://redirect.github.com/yufenglee)

**Kernel Optimizations:**

- Fix GQA and Rotary Embedding bugs affecting some models
([#&#8203;19801](https://redirect.github.com/microsoft/onnxruntime/pull/19801),
[#&#8203;19874](https://redirect.github.com/microsoft/onnxruntime/pull/19874))
- [@&#8203;aciddelgado](https://redirect.github.com/aciddelgado)
- Update replacement of MultiHeadAttention (MHA) and GroupQueryAttention
(GQA)
([#&#8203;19882](https://redirect.github.com/microsoft/onnxruntime/pull/19882))
- [@&#8203;kunal-vaishnavi](https://redirect.github.com/kunal-vaishnavi)
- Add support for packed QKV input and Rotary Embedding with sm<80 using
Memory Efficient Attention kernel
([#&#8203;20012](https://redirect.github.com/microsoft/onnxruntime/pull/20012))
- [@&#8203;aciddelgado](https://redirect.github.com/aciddelgado)

**Models:**

- Add support for benchmarking LLaMA model end-to-end performance
([#&#8203;19985](https://redirect.github.com/microsoft/onnxruntime/pull/19985),
[#&#8203;20033](https://redirect.github.com/microsoft/onnxruntime/pull/20033),
[#&#8203;20149](https://redirect.github.com/microsoft/onnxruntime/pull/20149))
- [@&#8203;kunal-vaishnavi](https://redirect.github.com/kunal-vaishnavi)
- Add example to demonstrate export of OpenAI Whisper implementation with batched prompts
([#&#8203;19854](https://redirect.github.com/microsoft/onnxruntime/pull/19854))
- [@&#8203;shubhambhokare1](https://redirect.github.com/shubhambhokare1)

This patch release also includes additional fixes by
[@&#8203;spampana95](https://redirect.github.com/spampana95) and
[@&#8203;enximi](https://redirect.github.com/enximi). **Big thank you to
all our contributors!**

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/langchain4j/langchain4j-embeddings).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzOC4xMTUuMSIsInVwZGF0ZWRJblZlciI6IjM4LjExNS4xIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6W119-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
renovate[bot] authored Oct 14, 2024
1 parent faab2d2 commit e65348d
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion langchain4j-embeddings/pom.xml
@@ -30,7 +30,7 @@
 <dependency>
   <groupId>com.microsoft.onnxruntime</groupId>
   <artifactId>onnxruntime</artifactId>
-  <version>1.17.1</version>
+  <version>1.19.2</version>
 </dependency>

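
With the version bumped in `langchain4j-embeddings/pom.xml`, a minimal smoke test along these lines (the class name and tensor values are illustrative, not part of this PR) can confirm that the 1.19.2 native library loads and basic tensor creation works:

```java
import java.util.Arrays;

import ai.onnxruntime.OnnxTensor;
import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtException;

public final class OrtSmokeTest {
    public static void main(String[] args) throws OrtException {
        // Getting the environment forces the native onnxruntime library to
        // load, which is enough to catch most packaging/ABI problems early.
        OrtEnvironment env = OrtEnvironment.getEnvironment();
        try (OnnxTensor tensor =
                 OnnxTensor.createTensor(env, new float[][] {{1f, 2f, 3f}})) {
            System.out.println("Tensor shape: "
                + Arrays.toString(tensor.getInfo().getShape()));
        }
    }
}
```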
