Closed
Go version
go version go1.23.4 linux/amd64
Output of go env in your module/workspace:
-
What did you do?
Compile the following program with GOARCH=amd64 GOAMD64=v3 go build -gcflags=-S:

package pkg

import "math"

func fooImplicit(x, y, z float64) float64 {
	return x*y + z
}

func fooExplicit(x, y, z float64) float64 {
	return math.FMA(x, y, z)
}
What did you see happen?
command-line-arguments.fooImplicit STEXT nosplit size=9 args=0x18 locals=0x0 funcid=0x0 align=0x0
0x0000 00000 (/home/dominikh/prj/src/example.com/bar.go:5) TEXT command-line-arguments.fooImplicit(SB), NOSPLIT|NOFRAME|ABIInternal, $0-24
0x0000 00000 (/home/dominikh/prj/src/example.com/bar.go:5) FUNCDATA $0, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
0x0000 00000 (/home/dominikh/prj/src/example.com/bar.go:5) FUNCDATA $1, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
0x0000 00000 (/home/dominikh/prj/src/example.com/bar.go:5) FUNCDATA $5, command-line-arguments.fooImplicit.arginfo1(SB)
0x0000 00000 (/home/dominikh/prj/src/example.com/bar.go:5) FUNCDATA $6, command-line-arguments.fooImplicit.argliveinfo(SB)
0x0000 00000 (/home/dominikh/prj/src/example.com/bar.go:5) PCDATA $3, $1
0x0000 00000 (/home/dominikh/prj/src/example.com/bar.go:6) MULSD X1, X0
0x0004 00004 (/home/dominikh/prj/src/example.com/bar.go:6) ADDSD X2, X0
0x0008 00008 (/home/dominikh/prj/src/example.com/bar.go:6) RET
0x0000 f2 0f 59 c1 f2 0f 58 c2 c3 ..Y...X..
command-line-arguments.fooExplicit STEXT nosplit size=9 args=0x18 locals=0x0 funcid=0x0 align=0x0
0x0000 00000 (/home/dominikh/prj/src/example.com/bar.go:9) TEXT command-line-arguments.fooExplicit(SB), NOSPLIT|NOFRAME|ABIInternal, $0-24
0x0000 00000 (/home/dominikh/prj/src/example.com/bar.go:9) FUNCDATA $0, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
0x0000 00000 (/home/dominikh/prj/src/example.com/bar.go:9) FUNCDATA $1, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
0x0000 00000 (/home/dominikh/prj/src/example.com/bar.go:9) FUNCDATA $5, command-line-arguments.fooExplicit.arginfo1(SB)
0x0000 00000 (/home/dominikh/prj/src/example.com/bar.go:9) FUNCDATA $6, command-line-arguments.fooExplicit.argliveinfo(SB)
0x0000 00000 (/home/dominikh/prj/src/example.com/bar.go:9) PCDATA $3, $1
0x0000 00000 (/home/dominikh/prj/src/example.com/bar.go:10) VFMADD231SD X1, X0, X2
0x0005 00005 (/home/dominikh/prj/src/example.com/bar.go:10) MOVUPS X2, X0
0x0008 00008 (/home/dominikh/prj/src/example.com/bar.go:10) RET
0x0000 c4 e2 f9 b9 d1 0f 10 c2 c3 .........
What did you expect to see?
I expected fooImplicit and fooExplicit to generate identical code when setting GOAMD64=v3.
On arm64, the compiler detects the x*y + z pattern and automatically uses FMA. On amd64, math.FMA uses runtime feature detection unless the GOAMD64 environment variable is set to v3 or higher, in which case calls to math.FMA compile directly to VFMADD231SD. However, x*y + z isn't detected, regardless of the value of GOAMD64.