Long-running worker OOMs in a memory-limited container: major/unified GC never triggers under sustained load (no cgroup-aware heap limit)

### Summary

A long-running `workerd serve` / `wrangler dev` worker under continuous request load grows its RSS until the kernel OOM-kills it at the container cgroup memory limit. Profiling shows V8 runs minor (scavenge) GC continuously but never runs a major / unified (cppgc) collection under load, so GC-managed objects allocated per request accumulate and are never reclaimed.

The memory is fully collectable. Forcing a GC, or letting the worker go idle, returns RSS to baseline immediately. So this is GC scheduling, not a leak: V8's heap-growth heuristic is paced against its own large internal limit, which sits well above the container's cgroup limit, so the process is killed before V8 decides a major collection is warranted.

In production Workers this is masked by per-isolate limits and eviction. In standalone/self-hosted workerd there is no equivalent bound, so a worker that is never idle climbs to the cgroup cap and OOM-loops.

### Minimal repro

`src/index.js`:
```js
export default {
  async fetch() {
    const sink = [];
    for (let i = 0; i < 20000; i++) {
      // Request/Headers/URL are JSG wrappables, allocated on the cppgc heap
      sink.push(new Request("https://example.com/p/" + i, { headers: { "x-i": String(i) } }));
    }
    return new Response("ok " + sink.length); // sink is request-scoped garbage on return
  }
};
```

`wrangler.jsonc`: `{ "name": "gc-repro", "main": "src/index.js", "compatibility_date": "2025-01-01" }`

Run with a fixed inspector port and drive load:
```sh
wrangler dev --local --port 8799 --inspector-port 9555
# in another shell:
for i in $(seq 1 800); do curl -s -o /dev/null http://127.0.0.1:8799/; done
```

Observed workerd RSS (wrangler 4.84.1, workerd 1.20260421.1, macOS arm64; same behavior on linux-64 in a 4 GiB cgroup):

| phase | RSS |
|---|---|
| idle | 66 MB |
| after 800 requests | 600 MB |
| after forced GC | 77 MB |

The +534 MB is not reclaimed by the constant minor GC during load. Forcing a full GC via CDP `HeapProfiler.collectGarbage` reclaims it back to baseline. Stopping load and letting the worker idle also reclaims it on its own (we saw 427 MB drop to 64 MB spontaneously). The heap is entirely collectable. It just does not get collected while the worker is under steady load.

### Evidence from a real deployment

Our worker is a Durable-Object-backed reverse proxy in a 4 GiB container under continuous traffic. RSS of the long-lived process grows monotonically at about 4 MB/min, from ~2 GB toward the cap, until the kernel cgroup OOM-killer terminates it, on roughly an 8-hour cycle.
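
As a rough sanity check on those figures, the sketch below estimates how long a constant growth rate takes to reach the cap. `minutesUntilCap` is a hypothetical helper, and the inputs are the rounded numbers quoted above (treating "~2 GB" and "4 MB/min" as decimal units is an assumption), so the result is only an order-of-magnitude estimate.

```js
// Hypothetical helper: minutes until RSS reaches the cap at a constant growth rate.
function minutesUntilCap(startBytes, capBytes, growthBytesPerMin) {
  if (growthBytesPerMin <= 0) return Infinity; // no growth, never reaches the cap
  return Math.max(0, capBytes - startBytes) / growthBytesPerMin;
}

const MB = 1e6;      // decimal megabyte (assumed unit for the RSS figures)
const GiB = 2 ** 30; // binary gibibyte (the cgroup limit)

// Rounded inputs from the deployment: start ~2 GB, cap 4 GiB, growth ~4 MB/min.
const minutes = minutesUntilCap(2000 * MB, 4 * GiB, 4 * MB);
console.log(`${(minutes / 60).toFixed(1)} h until the cap`);
```

With these rounded inputs the estimate comes out at several hours, the same order as the observed OOM cycle.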

GC cadence on that process over a 180s window of continuous load (a 60s window was identical):

- `v8::internal::Heap::PerformGarbageCollection`: ~390 / 180s (~2.2/s). Minor/scavenge runs constantly.
- `v8::internal::CppHeap::CollectGarbage` (cppgc/unified): 0
- `v8::internal::Heap::CollectAllGarbage` (full): 0

cppgc page lifecycle: `PageBackend::TryAllocateNormalPageMemory` and `NormalPage::TryCreate` are hit as pages are created, while `NormalPage::Destroy` and `FreeNormalPageMemory` stay at 0 calls, so pages are created and never freed. The net live `malloc`/`free` balance stays flat while RSS climbs, and the dominant allocation stacks are cppgc allocating `jsg::Wrappable`s on the request path:

```
cppgc::internal::ObjectAllocator::OutOfLineAllocateImpl
  <- cppgc::internal::MakeGarbageCollectedTraitInternal::Allocate
  <- workerd::jsg::HeapTracer::allocateShim(workerd::jsg::Wrappable&)
  <- workerd::jsg::Wrappable::attachWrapper
  <- workerd::jsg::wrapOpaque<workerd::jsg::Ref<workerd::api::Response>>
```

The JS old-space stays bounded (no Mark-Compact, no `JavaScript heap out of memory` abort). RSS climbs past ~1.4 GB to the 4 GiB cgroup cap with no V8 abort, which points at the cppgc/unified heap rather than V8 old-space as the growing region.

### Relation to existing issues

This is a different failure mode from the JS-old-space-limit reports #3120 (Mark-Compact runs repeatedly at ~1.4 GB, then aborts) and #3473 (JS-heap-OOM SIGABRT under concurrency). In both of those, major GC fires and V8 aborts at its own limit. Here major/unified GC never fires, and the kernel kills the process at the cgroup limit with no V8 abort.

### Ask

A way for standalone workerd to stay within a container memory budget. Any of:

1. cgroup awareness: read `memory.max` and pace major/unified (cppgc) GC against it.
2. Respond to OS/cgroup memory-pressure notifications with a major (cppgc-sweeping) collection.
3. A configurable per-isolate or per-process heap/memory limit (capnp config and/or CLI) that paces major GC.
4. At minimum, a supported way to pass V8 flags (for example to cap old space or tune GC) from `workerd serve` or `wrangler dev`. Today neither `workerd --help` / `serve --help` nor wrangler/miniflare expose a heap, memory, GC, or `--v8-flags` option, so embedders have no lever.
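
To make option 1 concrete, here is a minimal sketch of the decision logic, not a proposal for how workerd or V8 should implement it. `parseCgroupMemoryMax` and `shouldRequestMajorGc` are hypothetical names, and the 0.8 threshold is an arbitrary illustrative value. The only external fact relied on is that cgroup v2 `memory.max` contains either the literal `max` or a byte count.

```js
// Hypothetical helpers for illustration; not part of workerd, wrangler, or V8.

// Parse the contents of a cgroup v2 `memory.max` file.
// Returns the limit in bytes, or null when there is no limit or the text is unrecognized.
function parseCgroupMemoryMax(text) {
  const value = text.trim();
  if (value === "max") return null; // cgroup v2 writes "max" for "no limit"
  if (!/^\d+$/.test(value)) return null; // anything else is treated as unknown
  const bytes = Number(value);
  return Number.isSafeInteger(bytes) ? bytes : null;
}

// Decide whether a major collection should be requested, given current RSS.
// `threshold` is the fraction of the limit at which to react (assumed value).
function shouldRequestMajorGc(rssBytes, limitBytes, threshold = 0.8) {
  if (limitBytes === null) return false; // no known limit, nothing to pace against
  return rssBytes >= limitBytes * threshold;
}

const limit = parseCgroupMemoryMax("4294967296\n"); // a 4 GiB container
console.log(shouldRequestMajorGc(3.5e9, limit)); // RSS above 80% of the limit
console.log(shouldRequestMajorGc(1.0e9, limit)); // RSS well below the limit
```

Inside a cgroup v2 container the file is typically found at `/sys/fs/cgroup/memory.max`. A threshold check like this is the simplest form; pacing the heap-growth heuristic itself against the limit would be the more complete fix.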

### Environment

workerd `1.20260421.1`, wrangler `4.84.1`, `wrangler dev --local`. Reproduced on macOS arm64 and on linux-64 (kernel 6.17) in a 4 GiB cgroup v2 container.

### Repro notes

- wrangler's inspector proxy rejects the CDP WebSocket with `400 Expected Origin header` unless the client sends an `Origin` header.
- `HeapProfiler.collectGarbage`'s CDP response can take minutes to return on a large heap, but the collection itself completes immediately (RSS drops right away).
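
For anyone reproducing the forced-GC step, the following is a sketch of one way to send `HeapProfiler.collectGarbage` with an `Origin` header. `forceGc` is a hypothetical helper. The WebSocket constructor is injected, on the assumption that it accepts `(url, { headers })` and supports `addEventListener`, as the `ws` npm package does. The inspector WebSocket URL and which `Origin` values the proxy accepts are not covered here, so both are parameters.

```js
// Hypothetical helper: force a full GC over the Chrome DevTools Protocol.
// `WebSocketImpl` is assumed to accept (url, { headers }) like the `ws` package.
function forceGc(wsUrl, WebSocketImpl, origin = "http://127.0.0.1") {
  return new Promise((resolve, reject) => {
    // The Origin header avoids the `400 Expected Origin header` rejection.
    const socket = new WebSocketImpl(wsUrl, { headers: { Origin: origin } });
    const requestId = 1;

    socket.addEventListener("open", () => {
      socket.send(JSON.stringify({ id: requestId, method: "HeapProfiler.collectGarbage" }));
    });

    socket.addEventListener("message", (event) => {
      const message = JSON.parse(String(event.data));
      if (message.id !== requestId) return; // skip unrelated CDP events
      socket.close();
      if (message.error) reject(new Error(message.error.message));
      else resolve(message.result);
    });

    socket.addEventListener("error", (event) => {
      reject(event.error ?? new Error("WebSocket error"));
    });
  });
}
```

Because the CDP response can lag far behind the collection itself, it is more practical to watch RSS after sending the command than to wait for the promise to settle.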

