Description
Go version
go version go1.24.4 linux/amd64
Output of go env in your module/workspace:
AR='ar'
CC='gcc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='g++'
GCCGO='gccgo'
GO111MODULE=''
GOAMD64='v1'
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/usr/local/google/home/mpratt/.cache/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/usr/local/google/home/mpratt/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build4227360042=/tmp/go-build -gno-record-gcc-switches'
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMOD='/tmp/memprofile_leak/go.mod'
GOMODCACHE='/usr/local/google/home/mpratt/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/usr/local/google/home/mpratt/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/lib/go-1.24'
GOSUMDB='sum.golang.org'
GOTELEMETRY='on'
GOTELEMETRYDIR='/usr/local/google/home/mpratt/.config/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/lib/go-1.24/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.24.4'
GOWORK=''
PKG_CONFIG='pkg-config'
What did you do?
Reproducer: https://go.dev/play/p/uJ6XPZzNcEF (must be run locally; it requires real time).
Run a program with a very large memory profile and occasionally call runtime.MemProfile
to collect the profile.
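The linked playground program is the authoritative reproducer. Below is only a minimal sketch of the same idea; the left/right branching helpers, the depth of 14, MemProfileRate = 1, and the 5-second polling interval are all illustrative choices, not taken from the real reproducer.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

var sink [][]byte

//go:noinline
func leaf() { sink = append(sink, make([]byte, 64)) }

//go:noinline
func left(bits uint, depth int) { step(bits, depth) }

//go:noinline
func right(bits uint, depth int) { step(bits, depth) }

// step descends through left/right according to the bits of the counter, so
// each counter value allocates through a distinct call stack and therefore
// creates a distinct memory profile bucket.
func step(bits uint, depth int) {
	if depth == 0 {
		leaf()
		return
	}
	if bits&1 == 0 {
		left(bits>>1, depth-1)
	} else {
		right(bits>>1, depth-1)
	}
}

func main() {
	runtime.MemProfileRate = 1 // sample every allocation so the profile grows quickly

	go func() {
		for i := uint(0); ; i++ {
			step(i, 14) // 2^14 distinct stacks; the real reproducer creates far more
		}
	}()

	for range time.Tick(5 * time.Second) {
		start := time.Now()
		// Standard two-call pattern: size the record slice, then fill it.
		n, _ := runtime.MemProfile(nil, true)
		records := make([]runtime.MemProfileRecord, n+64)
		n, _ = runtime.MemProfile(records, true)
		fmt.Printf("MemProfile: %d records, took %v\n", n, time.Since(start))
	}
}
```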
What did you see happen?
As the profile grows, calls to runtime.MemProfile take longer and longer. That by itself isn't surprising, since there is more profile to copy. What is surprising is that stop-the-world (STW) pause times also grow substantially.
For example, after ~60s I see output like:
MemProfile: took 711.959904ms
GC STW stop time (ns):
| 0 [ -Inf, 0.0)
| 0 [ 0.0, 64.0)
| 0 [ 64.0, 128.0)
| ...
| 0 [ 384.0, 448.0)
| 0 [ 448.0, 512.0)
*| 2 [ 512.0, 640.0)
*****| 6 [ 640.0, 768.0)
********************| 24 [ 768.0, 896.0)
**************| 17 [ 896.0, 1024.0)
*******| 9 [1024.0, 1280.0)
***| 4 [1280.0, 1536.0)
*| 2 [1536.0, 1792.0)
| 0 [1792.0, 2048.0)
| 0 [2048.0, 2560.0)
| ...
| 0 [12288.0, 14336.0)
| 0 [14336.0, 16384.0)
*| 2 [16384.0, 20480.0)
| 0 [20480.0, 24576.0)
***| 4 [24576.0, 28672.0)
***| 4 [28672.0, 32768.0)
*****| 7 [32768.0, 40960.0)
******| 8 [40960.0, 49152.0)
| 1 [49152.0, 57344.0)
*| 2 [57344.0, 65536.0)
*| 2 [65536.0, 81920.0)
| 0 [81920.0, 98304.0)
| 0 [98304.0, 114688.0)
| ...
| 0 [117440512.0, 134217728.0)
| 0 [134217728.0, 167772160.0)
| 1 [167772160.0, 201326592.0)
| 0 [201326592.0, 234881024.0)
| 0 [234881024.0, 268435456.0)
| 0 [268435456.0, 335544320.0)
| 1 [335544320.0, 402653184.0)
| 0 [402653184.0, 469762048.0)
| 0 [469762048.0, 536870912.0)
| 1 [536870912.0, 671088640.0)
| 0 [671088640.0, 805306368.0)
| 0 [805306368.0, 939524096.0)
| ...
| 0 [105553116266496.0, 123145302310912.0)
| 0 [123145302310912.0, 140737488355328.0)
| 0 [140737488355328.0, +Inf)
This shows one sample with an STW stop time of ~537-671ms.
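For reference, a histogram like the one above can be dumped from runtime/metrics. A minimal sketch, assuming the /sched/pauses/stopping/gc:seconds metric (available since Go 1.22); whether the reproducer uses exactly this metric is an assumption, and bucket boundaries are converted from seconds to nanoseconds to match the output above:

```go
package main

import (
	"fmt"
	"runtime/metrics"
)

func dumpSTWHistogram() {
	samples := []metrics.Sample{{Name: "/sched/pauses/stopping/gc:seconds"}}
	metrics.Read(samples)
	if samples[0].Value.Kind() != metrics.KindFloat64Histogram {
		fmt.Println("metric unavailable")
		return
	}
	h := samples[0].Value.Float64Histogram()
	// len(h.Buckets) == len(h.Counts)+1: Counts[i] covers [Buckets[i], Buckets[i+1]).
	for i, c := range h.Counts {
		lo, hi := h.Buckets[i]*1e9, h.Buckets[i+1]*1e9 // seconds -> nanoseconds
		fmt.Printf("%8d [%14.1f, %14.1f)\n", c, lo, hi)
	}
}

func main() {
	dumpSTWHistogram()
}
```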
I believe the problem is that runtime.memProfileInternal holds profMemActiveLock for its entire duration, which prevents preemption from stopping the goroutine, so a pending stop-the-world has to wait for the whole profile copy to finish.
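For illustration only (a simplified stand-in, not the actual runtime source): the copy walks every profile bucket while one lock is held, and because the real runtime lock also keeps the goroutine non-preemptible, the window during which a stop-the-world cannot complete grows with the number of buckets.

```go
package sketch

import (
	"runtime"
	"sync"
)

// bucket is a simplified stand-in for the runtime's memory profile bucket.
type bucket struct {
	next   *bucket
	record runtime.MemProfileRecord
}

var (
	profMemActiveLock sync.Mutex // stand-in for the runtime-internal lock
	mbuckets          *bucket    // head of the list of all profile buckets
)

// memProfileSketch mirrors the shape described above: O(number of buckets)
// of work is done with the lock held, with no preemption point in between.
func memProfileSketch(copyFn func(runtime.MemProfileRecord)) {
	profMemActiveLock.Lock()
	for b := mbuckets; b != nil; b = b.next {
		copyFn(b.record)
	}
	profMemActiveLock.Unlock()
}
```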
What did you expect to see?
Calling runtime.MemProfile should not impact STW times.
My reproducer intentionally creates tons of unique stacks to make the profile very large. That has its own issues (#69590) with memory usage, but I don't believe it should impact STW times.