runtime: long calls to MemProfile cause unbounded STW stop time #74872

@prattmic

Description

Go version

go version go1.24.4 linux/amd64

Output of go env in your module/workspace:

AR='ar'
CC='gcc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='g++'
GCCGO='gccgo'
GO111MODULE=''
GOAMD64='v1'
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/usr/local/google/home/mpratt/.cache/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/usr/local/google/home/mpratt/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build4227360042=/tmp/go-build -gno-record-gcc-switches'
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMOD='/tmp/memprofile_leak/go.mod'
GOMODCACHE='/usr/local/google/home/mpratt/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/usr/local/google/home/mpratt/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/lib/go-1.24'
GOSUMDB='sum.golang.org'
GOTELEMETRY='on'
GOTELEMETRYDIR='/usr/local/google/home/mpratt/.config/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/lib/go-1.24/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.24.4'
GOWORK=''
PKG_CONFIG='pkg-config'

What did you do?

Reproducer: https://go.dev/play/p/uJ6XPZzNcEF (must be run locally, since it requires real elapsed time)

Run a program with a very large memory profile and call runtime.MemProfile to collect the profile occasionally.

What did you see happen?

As the profile grows in size, calls to runtime.MemProfile take longer and longer. That isn't surprising, since there is more profile to copy. What is surprising is that stop-the-world (STW) stop times grow along with it.

e.g., after ~60s, I see output like:

MemProfile: took 711.959904ms
GC STW stop time (ns):
                    |      0 [  -Inf,    0.0)
                    |      0 [   0.0,   64.0)
                    |      0 [  64.0,  128.0)
                    | ...
                    |      0 [ 384.0,  448.0)
                    |      0 [ 448.0,  512.0)
                   *|      2 [ 512.0,  640.0)
               *****|      6 [ 640.0,  768.0)
********************|     24 [ 768.0,  896.0)
      **************|     17 [ 896.0, 1024.0)
             *******|      9 [1024.0, 1280.0)
                 ***|      4 [1280.0, 1536.0)
                   *|      2 [1536.0, 1792.0)
                    |      0 [1792.0, 2048.0)
                    |      0 [2048.0, 2560.0)
                    | ...
                    |      0 [12288.0, 14336.0)
                    |      0 [14336.0, 16384.0)
                   *|      2 [16384.0, 20480.0)
                    |      0 [20480.0, 24576.0)
                 ***|      4 [24576.0, 28672.0)
                 ***|      4 [28672.0, 32768.0)
               *****|      7 [32768.0, 40960.0)
              ******|      8 [40960.0, 49152.0)
                    |      1 [49152.0, 57344.0)
                   *|      2 [57344.0, 65536.0)
                   *|      2 [65536.0, 81920.0)
                    |      0 [81920.0, 98304.0)
                    |      0 [98304.0, 114688.0)
                    | ...
                    |      0 [117440512.0, 134217728.0)
                    |      0 [134217728.0, 167772160.0)
                    |      1 [167772160.0, 201326592.0)
                    |      0 [201326592.0, 234881024.0)
                    |      0 [234881024.0, 268435456.0)
                    |      0 [268435456.0, 335544320.0)
                    |      1 [335544320.0, 402653184.0)
                    |      0 [402653184.0, 469762048.0)
                    |      0 [469762048.0, 536870912.0)
                    |      1 [536870912.0, 671088640.0)
                    |      0 [671088640.0, 805306368.0)
                    |      0 [805306368.0, 939524096.0)
                    | ...
                    |      0 [105553116266496.0, 123145302310912.0)
                    |      0 [123145302310912.0, 140737488355328.0)
                    |      0 [140737488355328.0,   +Inf)

This shows 1 sample of a STW stop time of ~537-671ms (the [536870912, 671088640) ns bucket).

I believe the problem is that runtime.memProfileInternal holds profMemActiveLock for its entire duration. Holding a runtime lock disables preemption, so a concurrent stop-the-world must wait for the entire MemProfile call to finish before it can stop that goroutine.

What did you expect to see?

Calling runtime.MemProfile should not impact STW times.

My reproducer intentionally creates tons of unique stacks to make the profile very large. That has its own issues (#69590) with memory usage, but I don't believe it should impact STW times.

    Labels

    NeedsInvestigation: Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
    compiler/runtime: Issues related to the Go compiler and/or runtime.
