Group_norm backward kernel optimization part 2 #1719

yucai-intel · 2025-06-04T09:07:01Z

The shape used in the derivative calculation was adjusted to improve memory access efficiency, which reduced the GroupNorm backward kernel latency by ~80% when affine=false (e.g. latency 107523ms -> 17823ms, shape=[32, 512, 256, 256]).
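For reviewers who want a reference point for the affine=false path, here is a minimal CPU sketch of the input gradient for one (sample, group) slice. It assumes the standard GroupNorm backward formula and treats a missing gamma as all ones, which is one reading of the "add dummy gamma" commit. The function name `group_norm_dx_ref` and its signature are hypothetical and are not taken from GroupNormKernels.cpp; the actual kernel in this PR may be organized differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not part of this PR).
// Computes dX for a single (sample, group) slice of GroupNorm.
// x and dy hold channels * hw values in channel-major order.
// gamma may be nullptr (affine=false); it is then treated as all ones.
std::vector<double> group_norm_dx_ref(const std::vector<double>& x,
                                      const std::vector<double>& dy,
                                      const double* gamma,
                                      std::size_t channels,
                                      std::size_t hw,
                                      double eps = 1e-5) {
  const std::size_t n = channels * hw;
  assert(n > 0 && x.size() == n && dy.size() == n);

  // Forward statistics of the group (biased variance).
  double mean = 0.0;
  for (double v : x) mean += v;
  mean /= static_cast<double>(n);
  double var = 0.0;
  for (double v : x) var += (v - mean) * (v - mean);
  var /= static_cast<double>(n);
  const double rstd = 1.0 / std::sqrt(var + eps);

  // Two reductions over the group: sum(dy * gamma) and sum(dy * gamma * x_hat).
  double sum_g = 0.0;
  double sum_gx = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    const double g = gamma ? gamma[i / hw] : 1.0;
    const double x_hat = (x[i] - mean) * rstd;
    sum_g += dy[i] * g;
    sum_gx += dy[i] * g * x_hat;
  }
  const double mean_g = sum_g / static_cast<double>(n);
  const double mean_gx = sum_gx / static_cast<double>(n);

  // dx = rstd * (dy * gamma - mean(dy * gamma) - x_hat * mean(dy * gamma * x_hat))
  std::vector<double> dx(n);
  for (std::size_t i = 0; i < n; ++i) {
    const double g = gamma ? gamma[i / hw] : 1.0;
    const double x_hat = (x[i] - mean) * rstd;
    dx[i] = rstd * (dy[i] * g - mean_g - x_hat * mean_gx);
  }
  return dx;
}
```

Passing `nullptr` and passing a vector of ones for `gamma` should give the same result, so a comparison like this could serve as a correctness check for the optimized kernel on small shapes.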

yucai-intel and others added 3 commits June 4, 2025 02:05

add dummy gamma

a47738f

revise

ff893a0

Update GroupNormKernels.cpp

21e37fd

yucai-intel marked this pull request as ready for review June 9, 2025 02:52

yucai-intel added the kernel_optimization label Jun 9, 2025

yucai-intel and others added 3 commits June 9, 2025 20:09

Merge branch 'main' into yucai/gn_dummy

2b5b486

fix err

dfd8aa9

Update GroupNormKernels.cpp

1ca5264

yucai-intel requested a review from toyxu June 10, 2025 06:02

yucai-intel changed the title ~~Group_norm backward kernel optimazation part 2~~ Group_norm backward kernel optimization part 2 Jun 10, 2025

yucai-intel added 2 commits June 11, 2025 10:23

Update GroupNormKernels.cpp

5b5be4d

Merge branch 'main' into yucai/gn_dummy

7035673
