[CPU][ARM] ARM NEON load/store fix for int8/uint8 #32781
base: master
Conversation
```cpp
if (broadcast) {
    utils::load_vector(data.b, data.s, ptr_reg, ptr_offset, broadcast, this);
} else {
    const size_t lane_count = cpu_isa_traits<isa>::vlen / dst_prc.size();
    // …
```
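For reference, a minimal standalone sketch of the lane_count arithmetic, assuming a 16-byte NEON vlen and a hypothetical 4-byte (f32) destination precision (neither value is taken from the diff itself):

```cpp
#include <cassert>
#include <cstddef>

int main() {
    // Assumed values for illustration: 128-bit NEON register, f32 destination.
    const std::size_t vlen = 16;
    const std::size_t dst_prc_size = 4;

    // Mirrors lane_count = cpu_isa_traits<isa>::vlen / dst_prc.size():
    // only 4 int8 source lanes are needed per vector, so a full 16-byte
    // load would read 12 bytes past what the op actually owns.
    const std::size_t lane_count = vlen / dst_prc_size;
    assert(lane_count == 4);
    return 0;
}
```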
Does it make sense to move this logic into utils::load_vector? It seems that whenever we call load_vector from anywhere else, we also want the boundaries preserved correctly in the broadcast == false case.
aobolensk left a comment
BTW, do we need to apply this fix to the store path as well?
@aobolensk done
Force-pushed from 8afa0ac to 6d50594
Force-pushed from d5e9b90 to fce68eb
…fset-safe load/store handling. Updated the i8/u8 path in jit_uni_eltwise_generic<…>::store_vector to use the new helper with proper lane_count, mirroring the load-side fix.
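As a rough illustration of what the mirrored store-side fix amounts to, here is a hypothetical NEON-intrinsics sketch (not the Xbyak_aarch64 code that store_vector actually emits; the function name and signature are made up): narrow the widened lanes back to bytes, then write only lane_count of them.

```cpp
#include <arm_neon.h>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Sketch only. Narrows i16 lanes back to i8 and writes exactly lane_count
// bytes, so the store never touches memory past the end of the destination.
void store_s8_narrowed_safe(int8_t* dst, int16x8_t v, std::size_t lane_count) {
    int8x8_t narrowed = vmovn_s16(v);   // i16 -> i8 narrowing, 8 lanes
    int8_t tmp[8];
    vst1_s8(tmp, narrowed);             // dump all 8 lanes into a local buffer
    std::memcpy(dst, tmp, lane_count);  // copy out only the valid lanes
}
```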
Force-pushed from 636f2b3 to 7829f1c
Implemented an ARM NEON load/store fix so int8/uint8 vectors are read lane‑by‑lane instead of pulling a full 16‑byte block and overrunning buffers. In src/plugins/intel_cpu/src/nodes/kernels/aarch64/jit_uni_eltwise_generic.cpp the i8/u8 branch now uses ld1 on individual byte lanes (with broadcast still using the old helper) and only then performs the sign/zero extends, preventing illegal memory accesses when mixed‑precision eltwise ops hit the JIT.
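For readers unfamiliar with the generator, here is a minimal sketch in plain NEON intrinsics (not the Xbyak_aarch64 code the JIT emits; function names and the lane_count handling are assumptions) of the before/after semantics: the old path did a full 16-byte load regardless of how many lanes were needed, while the fixed path touches only lane_count bytes before widening.

```cpp
#include <arm_neon.h>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Old behaviour (simplified): vld1q_s8 always reads 16 bytes, which can run
// past the end of the source buffer when fewer lanes are actually needed.
int16x8_t load_s8_widened_unsafe(const int8_t* src) {
    int8x16_t full = vld1q_s8(src);      // unconditional 16-byte read
    return vmovl_s8(vget_low_s8(full));  // sign-extend the low 8 lanes to i16
}

// Fixed behaviour (simplified): read only lane_count bytes (lane-by-lane in
// the JIT, a bounded copy here), then sign-extend exactly as before.
int16x8_t load_s8_widened_safe(const int8_t* src, std::size_t lane_count) {
    int8_t tmp[8] = {0};                 // unused lanes stay zero
    std::memcpy(tmp, src, lane_count);   // never reads past src + lane_count
    return vmovl_s8(vld1_s8(tmp));       // same extend step as the unsafe path
}
```

In the actual generator the bounded read is done with per-lane ld1 instructions rather than a memcpy, but the boundary guarantee is the same.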