cmd/compile: shorter slicing instruction sequence on x86/amd64 #47969

Open
@martisch

Compiling on tip:

func Slice(x []int, y int) []int {
    return x[y:]
}

and cutting out the function prologue on amd64 we get:

CMPQ    BX, DI
JCS     panicpath
SUBQ    DI, CX
SUBQ    DI, BX
MOVQ    CX, DX
NEGQ    CX
SHLQ    $3, DI
SARQ    $63, CX
ANDQ    CX, DI
ADDQ    DI, AX
MOVQ    DX, CX
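
For readers less familiar with the NEGQ/SARQ idiom, here is a minimal Go sketch of what the mask part of the sequence above computes. The function names are made up for illustration and are not compiler internals, and int64 stands in for the 64-bit registers. As far as I understand it, the point of the mask is that a result with zero capacity keeps the original data pointer instead of one advanced by the offset.

```go
package main

import "fmt"

// sliceMask mirrors the NEGQ/SARQ $63 pair: it yields all ones when
// newCap > 0 and zero when newCap == 0. newCap is assumed to be
// non-negative, which holds after a successful bounds check.
func sliceMask(newCap int64) int64 {
	return -newCap >> 63
}

// maskedDelta mirrors SHLQ $3 followed by ANDQ: the byte offset y*8
// is kept only when the remaining capacity is non-zero.
func maskedDelta(y, newCap int64) int64 {
	return (y << 3) & sliceMask(newCap)
}

func main() {
	fmt.Println(maskedDelta(2, 3)) // 16: pointer advances by two ints
	fmt.Println(maskedDelta(5, 0)) // 0: pointer stays where it was
}
```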

We should be able to slim that down to (not tested):

CMPQ    BX, DI
JCS     panicpath
SUBQ    DI, CX
SUBQ    DI, BX
SHLQ    $3, DI
TESTQ   CX, CX
CMOVQEQ CX, DI
ADDQ    DI, AX

by combining the AND and OpSlicemask operations emitted during SSA generation into a single new OpSlicedelta operation:

before:

mask := s.newValue1(ssa.OpSlicemask, types.Types[types.TINT], rcap)
delta = s.newValue2(andOp, types.Types[types.TINT], delta, mask)

after:

delta = s.newValue2(ssa.OpSlicedelta, types.Types[types.TINT], delta, rcap)
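
To make the intended meaning of the combined op concrete, the following sketch compares the current mask-and-AND form with a plain select, which is what a TESTQ plus conditional move would implement. The semantics given to sliceDelta here are my assumption about what OpSlicedelta would mean. All function names are hypothetical.

```go
package main

import "fmt"

// maskDelta is the current form: OpSlicemask followed by an AND.
func maskDelta(delta, rcap int64) int64 {
	mask := -rcap >> 63 // all ones if rcap > 0, zero if rcap == 0
	return delta & mask
}

// sliceDelta is the assumed meaning of the proposed single op:
// select zero when the remaining capacity is zero, else keep delta.
func sliceDelta(delta, rcap int64) int64 {
	if rcap == 0 {
		return 0
	}
	return delta
}

// agree reports whether both forms match for every rcap in [0, maxCap]
// and a handful of sample deltas.
func agree(maxCap int64) bool {
	for rcap := int64(0); rcap <= maxCap; rcap++ {
		for _, delta := range []int64{0, 8, 16, 1 << 20} {
			if maskDelta(delta, rcap) != sliceDelta(delta, rcap) {
				return false
			}
		}
	}
	return true
}

func main() {
	fmt.Println(agree(1000))
}
```

The check only covers non-negative capacities, which is the range that can occur after the bounds check has passed.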

By either making the compiler's SSA optimizations smarter or pulling even more operations into a special SSA op, we could save the TESTQ and get to:

CMPQ    BX, DI
JCS     panicpath
SUBQ    DI, BX
SUBQ    DI, CX
CMOVQEQ CX, DI
SHLQ    $3, DI
ADDQ    DI, AX

However, without benchmarking it is unclear whether this will be any faster (or worth the complexity) once the scaling of the index for the delta happens after the CMOV.
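
Since that question can only be settled by measurement, a minimal benchmark harness could look like the sketch below. It only measures the slicing expression as the installed compiler emits it, so comparing the variants would still require building the toolchain with and without the change. The names benchSlice and sink are hypothetical.

```go
package main

import (
	"fmt"
	"testing"
)

// Slice is the function from the issue; noinline keeps the slicing
// sequence from being folded into the benchmark loop.
//
//go:noinline
func Slice(x []int, y int) []int {
	return x[y:]
}

// sink keeps the result alive so the call is not optimized away.
var sink []int

func benchSlice(b *testing.B) {
	x := make([]int, 64)
	for i := 0; i < b.N; i++ {
		sink = Slice(x, i&63) // vary y, always in range
	}
}

func main() {
	// testing.Benchmark runs the function outside of "go test".
	r := testing.Benchmark(benchSlice)
	fmt.Println(r.String())
}
```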

A further reduction in instructions is possible by moving the panic jumps to be dependent on the SUB instructions:

SUBQ    DI, BX
JCS     panicpath
SUBQ    DI, CX
CMOVQEQ CX, DI
SHLQ    $3, DI
ADDQ    DI, AX

That will then need extra handling in the panic path to recover the original slice len/cap.
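
As a rough illustration of why the separate compare can go away, the sketch below uses math/bits.Sub64 to model the subtraction. Its borrow output carries the same information as the unsigned comparison in the original CMPQ/JCS pair, and the original length can be recomputed by adding y back. The helper names are hypothetical, and this models the arithmetic only, not the compiler's actual panic path.

```go
package main

import (
	"fmt"
	"math/bits"
)

// subAndCheck models a SUBQ followed by a branch on the borrow:
// it returns length-y (wrapping) and whether y was out of range.
func subAndCheck(length, y uint64) (newLen uint64, outOfRange bool) {
	diff, borrow := bits.Sub64(length, y, 0)
	return diff, borrow != 0
}

// recoverLen models what the panic path would have to do to report
// the original length after it has been overwritten by the SUBQ.
func recoverLen(newLen, y uint64) uint64 {
	return newLen + y // wrapping add undoes the wrapping subtract
}

func main() {
	neg := int64(-1) // a negative index is a huge unsigned value
	for _, y := range []uint64{0, 3, 5, 6, uint64(neg)} {
		const length = 5
		newLen, bad := subAndCheck(length, y)
		fmt.Println(bad == (length < y), recoverLen(newLen, y) == length)
	}
}
```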

Finally, for this specific case the SHL and ADD can be folded into a LEA:

SUBQ    DI, BX
JCS     panicpath
SUBQ    DI, CX
CMOVQEQ CX, DI
LEAQ    (AX)(DI*8), AX

/cc @randall77 @josharian @mdempsky

Labels: NeedsInvestigation, Performance, binary-size, compiler/runtime
