JIT: Support bitwise field insertions for return registers (dotnet#113178)
Based on the new `FIELD_LIST` support for returns, this PR adds support for the
JIT to combine smaller fields via bitwise operations when returned, instead of
spilling them to the stack.
win-x64 examples:
```csharp
static int? Test()
{
return Environment.TickCount;
}
```
```diff
call System.Environment:get_TickCount():int
- mov dword ptr [rsp+0x24], eax
- mov byte ptr [rsp+0x20], 1
- mov rax, qword ptr [rsp+0x20]
- ;; size=19 bbWeight=1 PerfScore 4.00
+ mov eax, eax
+ shl rax, 32
+ or rax, 1
+ ;; size=15 bbWeight=1 PerfScore 2.00
```
(the `mov eax, eax` is unnecessary, but not that simple to get rid of)
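To make the new sequence easier to read, here is a rough C# rendering of what the three instructions compute. It is only a sketch: `PackNullableInt` and its class are hypothetical names used for illustration, not JIT code, and the layout (`hasValue` in byte 0, `value` at offset 4) is taken from the old stores in the diff above.

```csharp
static class NullableReturnSketch
{
    // Hypothetical helper mirroring the new win-x64 codegen for `int?`.
    public static ulong PackNullableInt(bool hasValue, int value)
    {
        ulong packed = unchecked((uint)value); // mov eax, eax  (zero-extend to 64 bits)
        packed <<= 32;                         // shl rax, 32   (value lives at offset 4)
        packed |= hasValue ? 1UL : 0UL;        // or  rax, 1    (hasValue lives at offset 0)
        return packed;
    }
}
```

For example, `PackNullableInt(true, 5)` yields `0x0000000500000001`, which is the same bit pattern the old code produced by storing both fields to the stack and reloading 8 bytes.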
```csharp
static (int x, float y) Test(int x, float y)
{
return (x, y);
}
```
```diff
- mov dword ptr [rsp], ecx
- vmovss dword ptr [rsp+0x04], xmm1
- mov rax, qword ptr [rsp]
+ vmovd eax, xmm1
+ shl rax, 32
+ mov ecx, ecx
+ or rax, rcx
;; size=13 bbWeight=1 PerfScore 3.00
```
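The tuple case follows the same pattern, except that the float's raw bits are moved to an integer register first. A minimal sketch, assuming `x` sits at offset 0 and `y` at offset 4 as the old stores suggest; `PackIntFloat` is a hypothetical helper name:

```csharp
using System;

static class TupleReturnSketch
{
    // Hypothetical helper mirroring the new win-x64 codegen for (int x, float y).
    public static ulong PackIntFloat(int x, float y)
    {
        // vmovd eax, xmm1: reinterpret the float's bits as a 32-bit integer.
        ulong packed = unchecked((uint)BitConverter.SingleToInt32Bits(y));
        packed <<= 32;                 // shl rax, 32  (y lives at offset 4)
        packed |= unchecked((uint)x);  // mov ecx, ecx + or rax, rcx  (x lives at offset 0)
        return packed;
    }
}
```

For example, `PackIntFloat(1, 1.0f)` yields `0x3F80000000000001`, since `1.0f` has the bit pattern `0x3F800000`.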
An arm64 example:
```csharp
static Memory<int> ToMemory(int[] arr)
{
return arr.AsMemory();
}
```
```diff
G_M45070_IG01: ;; offset=0x0000
- stp fp, lr, [sp, #-0x20]!
+ stp fp, lr, [sp, #-0x10]!
mov fp, sp
- str xzr, [fp, #0x10] // [V03 tmp2]
- ;; size=12 bbWeight=1 PerfScore 2.50
-G_M45070_IG02: ;; offset=0x000C
+ ;; size=8 bbWeight=1 PerfScore 1.50
+G_M45070_IG02: ;; offset=0x0008
cbz x0, G_M45070_IG06
;; size=4 bbWeight=1 PerfScore 1.00
-G_M45070_IG03: ;; offset=0x0010
- str x0, [fp, #0x10] // [V07 tmp6]
- str wzr, [fp, #0x18] // [V08 tmp7]
- ldr x0, [fp, #0x10] // [V07 tmp6]
- ldr w0, [x0, #0x08]
- str w0, [fp, #0x1C] // [V09 tmp8]
- ;; size=20 bbWeight=0.80 PerfScore 6.40
-G_M45070_IG04: ;; offset=0x0024
- ldp x0, x1, [fp, #0x10] // [V03 tmp2], [V03 tmp2+0x08]
- ;; size=4 bbWeight=1 PerfScore 3.00
-G_M45070_IG05: ;; offset=0x0028
- ldp fp, lr, [sp], #0x20
+G_M45070_IG03: ;; offset=0x000C
+ ldr w1, [x0, #0x08]
+ ;; size=4 bbWeight=0.80 PerfScore 2.40
+G_M45070_IG04: ;; offset=0x0010
+ mov w1, w1
+ mov x2, xzr
+ orr x1, x2, x1, LSL #32
+ ;; size=12 bbWeight=1 PerfScore 2.00
+G_M45070_IG05: ;; offset=0x001C
+ ldp fp, lr, [sp], #0x10
ret lr
;; size=8 bbWeight=1 PerfScore 2.00
-G_M45070_IG06: ;; offset=0x0030
- str xzr, [fp, #0x10] // [V07 tmp6]
- str xzr, [fp, #0x18]
+G_M45070_IG06: ;; offset=0x0024
+ mov x0, xzr
+ mov w1, wzr
b G_M45070_IG04
- ;; size=12 bbWeight=0.20 PerfScore 0.60
+ ;; size=12 bbWeight=0.20 PerfScore 0.40
```
(sneak peek -- this codegen requires some supplementary changes, and there are
additional opportunities here)
This is the return counterpart to dotnet#112740. That PR has a bunch of regressions
that make it look like we need to support returns/call arguments first, before
we try to support parameters.
There are a few follow-ups here:
- Support for float->float insertions (when a float value needs to be returned
as the 1st, 2nd, ... field of a SIMD register)
- Support for coalescing memory loads, particularly because the fields of the
`FIELD_LIST` come from a promoted struct that ended up DNER. In those cases we
should be able to recombine the fields back to a single large field, instead
of combining them with bitwise operations.
- Support for constant folding the bitwise insertions. This requires some more
constant folding support in lowering.
- The JIT has lots of (now outdated) restrictions based around multi-reg returns
that get in the way. Lifting these should improve things considerably.