Pull requests: microsoft/onnxruntime
[webgpu] Apply Flash Attention if sliding window exceeds KV cache length
#25594
opened Jul 30, 2025 by
daijh
add session_id_ to LogEvaluationStart/Stop, LogSessionCreationStart
#25590
opened Jul 30, 2025 by
xieofxie
Cache opSupportLimits to improve the performance and update tracing e…
#25589
opened Jul 30, 2025 by
qwu16
Refactor code to prevent internal structure from leaking outside Graph class
#25586
opened Jul 30, 2025 by
yuslepukhin
Draft
NEON kernels for NCHWc Convolution and Pooling
#25580
opened Jul 29, 2025 by
Rohanjames1997
Draft: [NV TRT RTX EP] Fix onnx checker for constants in subgraph
#25579
opened Jul 29, 2025 by
gedoensmax
Add CUDA implementation of GatherBlockQuantized operator
#25575
opened Jul 29, 2025 by
xiaomsft
[Web] Avoid unnecessary data copy for pre-allocated tensors
ep:WebNN
WebNN execution provider
#25571
opened Jul 29, 2025 by
Honry
[EP ABI] Support for TENSOR type attribute
release:1.23.0
#25566
opened Jul 28, 2025 by
chilo-ms
[NV TRT RTX EP] Leverage ORT allocator for workspace allocations
#25564
opened Jul 28, 2025 by
gedoensmax
[CUDA EP] Add hardswish op and add bf16 support for hardsigmoid
#25562
opened Jul 28, 2025 by
Stonesjtu
Fix the GQA documentation for dimension of sin and cos cache.
#25559
opened Jul 28, 2025 by
gaugarg-nv
[webgpu] Add more GEMM test
ep:WebGPU
ort-web webgpu provider
#25556
opened Jul 28, 2025 by
xiaofeihan1