Long-running worker OOMs in a memory-limited container: major/unified GC never triggers under sustained load (no cgroup-aware heap limit) #6824

Description

@com6056

Summary

A long-running workerd serve / wrangler dev worker under continuous request load grows its RSS until the kernel OOM-kills it at the container cgroup memory limit. Profiling shows V8 runs minor (scavenge) GC continuously but never runs a major / unified (cppgc) collection under load, so GC-managed objects allocated per request accumulate and are never reclaimed.

The memory is fully collectable. Forcing a GC, or letting the worker go idle, returns RSS to baseline immediately. So this is GC scheduling, not a leak: V8's heap-growth heuristic is paced against its own large internal limit, which sits well above the container's cgroup limit, so the process is killed before V8 decides a major collection is warranted.

In production Workers this is masked by per-isolate limits and eviction. In standalone/self-hosted workerd there is no equivalent bound, so a worker that is never idle climbs to the cgroup cap and OOM-loops.

Minimal repro

src/index.js:

export default {
  async fetch() {
    const sink = [];
    for (let i = 0; i < 20000; i++) {
      // Request/Headers/URL are JSG wrappables, allocated on the cppgc heap
      sink.push(new Request("https://example.com/p/" + i, { headers: { "x-i": String(i) } }));
    }
    return new Response("ok " + sink.length); // sink is request-scoped garbage on return
  }
};

wrangler.jsonc: { "name": "gc-repro", "main": "src/index.js", "compatibility_date": "2025-01-01" }

Run with a fixed inspector port and drive load:

wrangler dev --local --port 8799 --inspector-port 9555
# in another shell:
for i in $(seq 1 800); do curl -s -o /dev/null http://127.0.0.1:8799/; done

Observed workerd RSS (wrangler 4.84.1, workerd 1.20260421.1, macOS arm64; same behavior on linux-64 in a 4 GiB cgroup):

phase                 RSS
idle                  66 MB
after 800 requests    600 MB
after forced GC       77 MB

The 534 MB of growth (66 MB to 600 MB) is not reclaimed by the constant minor GC during load. Forcing a full GC via CDP HeapProfiler.collectGarbage brings RSS back to 77 MB, within 11 MB of the idle baseline. Stopping load and letting the worker idle also reclaims it without intervention (in one run RSS dropped from 427 MB to 64 MB spontaneously). The heap is entirely collectable. It just does not get collected while the worker is under steady load.

Evidence from a real deployment

Our worker is a Durable-Object-backed reverse proxy in a 4 GiB container under continuous traffic. The long-lived process grows about 4 MB/min, monotonic, from ~2 GB toward the cap, then the kernel cgroup OOM-killer terminates it on roughly an 8-hour cycle.
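As a sanity check on those numbers, here is a minimal sketch of the arithmetic, assuming constant linear growth and treating "~2 GB" as 2048 MiB (both are simplifications, and `minutesToCap` is a hypothetical helper, not part of workerd or wrangler):

```javascript
// Estimate minutes until RSS reaches the cgroup cap at a constant growth rate.
// All inputs are in MiB (or MiB per minute); this is only a linear extrapolation.
function minutesToCap(startMiB, capMiB, growthMiBPerMin) {
  if (growthMiBPerMin <= 0) return Infinity; // no growth, the cap is never reached
  return Math.max(0, capMiB - startMiB) / growthMiBPerMin;
}

// Assumed inputs: ~2 GB starting RSS, 4 GiB cap, ~4 MB/min growth.
const minutes = minutesToCap(2048, 4096, 4);
console.log(`${minutes} min (~${(minutes / 60).toFixed(1)} h)`); // 512 min (~8.5 h)
```

That lines up with the roughly 8-hour OOM cycle we observe.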

GC cadence on that process over a 180s window of continuous load (a 60s window was identical):

  • v8::internal::Heap::PerformGarbageCollection: ~390 / 180s (~2.2/s). Minor/scavenge runs constantly.
  • v8::internal::CppHeap::CollectGarbage (cppgc/unified): 0
  • v8::internal::Heap::CollectAllGarbage (full): 0

cppgc page lifecycle: PageBackend::TryAllocateNormalPageMemory and NormalPage::TryCreate create pages, while NormalPage::Destroy and FreeNormalPageMemory stay at 0 (created, never freed). A net-live malloc/free balance stays flat while RSS climbs, and the dominant allocation stacks are cppgc allocating jsg::Wrappables on the request path:

cppgc::internal::ObjectAllocator::OutOfLineAllocateImpl
  <- cppgc::internal::MakeGarbageCollectedTraitInternal::Allocate
  <- workerd::jsg::HeapTracer::allocateShim(workerd::jsg::Wrappable&)
  <- workerd::jsg::Wrappable::attachWrapper
  <- workerd::jsg::wrapOpaque<workerd::jsg::Ref<workerd::api::Response>>

The JS old-space stays bounded (no Mark-Compact, no JavaScript heap out of memory abort). RSS climbs past ~1.4 GB to the 4 GiB cgroup cap with no V8 abort, which points at the cppgc/unified heap rather than V8 old-space as the growing region.

Relation to existing issues

This is a different failure mode from the JS-old-space-limit reports #3120 (Mark-Compact runs repeatedly at ~1.4 GB, then aborts) and #3473 (JS-heap-OOM SIGABRT under concurrency). In both of those, major GC fires and V8 aborts at its own limit. Here major/unified GC never fires, and the kernel kills the process at the cgroup limit with no V8 abort.

Ask

A way for standalone workerd to stay within a container memory budget. Any of:

  1. cgroup awareness: read memory.max and pace major/unified (cppgc) GC against it.
  2. Respond to OS/cgroup memory-pressure notifications with a major (cppgc-sweeping) collection.
  3. A configurable per-isolate or per-process heap/memory limit (capnp config and/or CLI) that paces major GC.
  4. At minimum, a supported way to pass V8 flags (for example to cap old space or tune GC) from workerd serve or wrangler dev. Today neither workerd --help / serve --help nor wrangler/miniflare expose a heap, memory, GC, or --v8-flags option, so embedders have no lever.
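To make option 1 concrete, here is a rough sketch of the kind of check we have in mind, written as an external Node.js script rather than as workerd code. It assumes cgroup v2 files mounted at /sys/fs/cgroup inside the container. The function names and the 0.8 threshold are hypothetical choices for illustration, not anything workerd implements today.

```javascript
// cgroup v2 memory.max holds either a byte count or the literal string "max".
function parseMemoryMax(text) {
  const value = text.trim();
  if (value === "max") return Infinity; // no limit configured
  if (!/^\d+$/.test(value)) throw new Error(`unexpected memory.max value: ${value}`);
  return Number(value);
}

// Hypothetical policy: ask for a major collection once usage crosses a
// fraction of the limit. The 0.8 default is an arbitrary illustration.
function shouldCollect(currentBytes, maxBytes, fraction = 0.8) {
  return Number.isFinite(maxBytes) && currentBytes >= maxBytes * fraction;
}

// Read the current usage and limit for this container's cgroup.
// Assumes the cgroup v2 layout; the directory is a parameter for testing.
function readCgroupUsage(dir = "/sys/fs/cgroup") {
  const fs = require("node:fs");
  const max = parseMemoryMax(fs.readFileSync(`${dir}/memory.max`, "utf8"));
  const current = Number(fs.readFileSync(`${dir}/memory.current`, "utf8").trim());
  return { current, max };
}
```

With a 4 GiB limit, `shouldCollect` would return true once memory.current reaches about 3.2 GiB. An in-process version would pace major/unified GC against the same value instead of polling from outside.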

Environment

workerd 1.20260421.1, wrangler 4.84.1, wrangler dev --local. Reproduced on macOS arm64 and on linux-64 (kernel 6.17) in a 4 GiB cgroup v2 container.

Repro notes

  • wrangler's inspector proxy rejects the CDP WebSocket with 400 Expected Origin header unless the client sends an Origin header.
  • HeapProfiler.collectGarbage's CDP response can take minutes to return on a large heap, but the collection itself completes immediately (RSS drops right away).
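For anyone reproducing the forced-GC step, here is a minimal sketch of a client that accounts for both notes above. It assumes the third-party `ws` npm package is installed (its `origin` option sets the Origin header). The WebSocket URL is left as a parameter because it depends on your session, and the default Origin value is a guess that may need adjusting. `collectGarbageRequest` and `forceGc` are hypothetical helper names.

```javascript
// Build the CDP request. HeapProfiler.collectGarbage takes no parameters.
function collectGarbageRequest(id) {
  return JSON.stringify({ id, method: "HeapProfiler.collectGarbage" });
}

// Connect to the inspector, request a full GC, and resolve with the reply.
// wsUrl: the inspector WebSocket URL for your wrangler dev session.
// origin: sent as the Origin header; the default here is an assumption.
function forceGc(wsUrl, origin = "http://127.0.0.1") {
  const WebSocket = require("ws"); // third-party package: npm install ws
  return new Promise((resolve, reject) => {
    const socket = new WebSocket(wsUrl, { origin });
    socket.on("open", () => socket.send(collectGarbageRequest(1)));
    socket.on("message", (data) => {
      const reply = JSON.parse(data.toString());
      if (reply.id === 1) {
        // No timeout on purpose: the reply can take minutes on a large heap.
        socket.close();
        resolve(reply);
      }
    });
    socket.on("error", reject);
  });
}
```

Because the reply can lag far behind the collection itself, watch RSS directly rather than waiting on the promise to judge when the GC has finished.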
