runtime: long calls to MemProfile cause unbounded STW stop time #74872

@prattmic

Description

Go version

go version go1.24.4 linux/amd64

Output of go env in your module/workspace:

AR='ar'
CC='gcc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='g++'
GCCGO='gccgo'
GO111MODULE=''
GOAMD64='v1'
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/usr/local/google/home/mpratt/.cache/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/usr/local/google/home/mpratt/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build4227360042=/tmp/go-build -gno-record-gcc-switches'
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMOD='/tmp/memprofile_leak/go.mod'
GOMODCACHE='/usr/local/google/home/mpratt/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/usr/local/google/home/mpratt/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/lib/go-1.24'
GOSUMDB='sum.golang.org'
GOTELEMETRY='on'
GOTELEMETRYDIR='/usr/local/google/home/mpratt/.config/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/lib/go-1.24/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.24.4'
GOWORK=''
PKG_CONFIG='pkg-config'

What did you do?

Reproducer: https://go.dev/play/p/uJ6XPZzNcEF (must be run locally, since it requires real elapsed time)

Run a program with a very large memory profile and call runtime.MemProfile to collect the profile occasionally.

What did you see happen?

As the profile grows in size, calls to runtime.MemProfile take longer and longer. That isn't surprising, since there is more profile to copy. What is surprising is that stop-the-world (STW) stop times grow along with it.

e.g., after ~60s, I see output like:

MemProfile: took 711.959904ms
GC STW stop time (ns):
                    |      0 [  -Inf,    0.0)
                    |      0 [   0.0,   64.0)
                    |      0 [  64.0,  128.0)
                    | ...
                    |      0 [ 384.0,  448.0)
                    |      0 [ 448.0,  512.0)
                   *|      2 [ 512.0,  640.0)
               *****|      6 [ 640.0,  768.0)
********************|     24 [ 768.0,  896.0)
      **************|     17 [ 896.0, 1024.0)
             *******|      9 [1024.0, 1280.0)
                 ***|      4 [1280.0, 1536.0)
                   *|      2 [1536.0, 1792.0)
                    |      0 [1792.0, 2048.0)
                    |      0 [2048.0, 2560.0)
                    | ...
                    |      0 [12288.0, 14336.0)
                    |      0 [14336.0, 16384.0)
                   *|      2 [16384.0, 20480.0)
                    |      0 [20480.0, 24576.0)
                 ***|      4 [24576.0, 28672.0)
                 ***|      4 [28672.0, 32768.0)
               *****|      7 [32768.0, 40960.0)
              ******|      8 [40960.0, 49152.0)
                    |      1 [49152.0, 57344.0)
                   *|      2 [57344.0, 65536.0)
                   *|      2 [65536.0, 81920.0)
                    |      0 [81920.0, 98304.0)
                    |      0 [98304.0, 114688.0)
                    | ...
                    |      0 [117440512.0, 134217728.0)
                    |      0 [134217728.0, 167772160.0)
                    |      1 [167772160.0, 201326592.0)
                    |      0 [201326592.0, 234881024.0)
                    |      0 [234881024.0, 268435456.0)
                    |      0 [268435456.0, 335544320.0)
                    |      1 [335544320.0, 402653184.0)
                    |      0 [402653184.0, 469762048.0)
                    |      0 [469762048.0, 536870912.0)
                    |      1 [536870912.0, 671088640.0)
                    |      0 [671088640.0, 805306368.0)
                    |      0 [805306368.0, 939524096.0)
                    | ...
                    |      0 [105553116266496.0, 123145302310912.0)
                    |      0 [123145302310912.0, 140737488355328.0)
                    |      0 [140737488355328.0,   +Inf)

This shows 1 sample of a STW stop time of ~537-671ms (the [536870912, 671088640) ns bucket).

I believe the problem is that runtime.memProfileInternal holds profMemActiveLock for its entire duration. Holding a runtime lock disables preemption, so a concurrent stop-the-world must wait for the entire MemProfile call to finish before it can stop that goroutine.

What did you expect to see?

Calling runtime.MemProfile should not impact STW times.

My reproducer intentionally creates tons of unique stacks to make the profile very large. That has its own issues (#69590) with memory usage, but I don't believe it should impact STW times.

    Labels

    NeedsInvestigation: Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
    compiler/runtime: Issues related to the Go compiler and/or runtime.
