Conversation

@artemsolod (Author)

To get things going with #56521 I've made a minimal implementation that mirrors one from numpy (https://github.com/numpy/numpy/blob/7c0e2e4224c6feb04a2ac4aa851f49a2c2f6189f/numpy/_core/src/multiarray/alloc.c#L113).

What this does: changes jl_gc_managed_malloc(size_t sz) to check whether the requested allocation is big enough to benefit from huge pages. If so, the allocation is page-aligned and the appropriate madvise (MADV_HUGEPAGE) is called on the memory pointer.
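
A minimal C sketch of that idea, modeled on the numpy code linked above; the helper name, the 4 MiB threshold, and the page-rounding details are illustrative assumptions rather than the PR's exact code:

#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

#define HUGEPAGE_THRESHOLD (1u << 22)  /* 4 MiB, the constant numpy hardcodes */

static void *managed_malloc_sketch(size_t sz)
{
    void *p = malloc(sz);
#ifdef MADV_HUGEPAGE
    if (p != NULL && sz >= HUGEPAGE_THRESHOLD) {
        /* madvise wants a page-aligned start, so round up to the next 4 KiB boundary */
        uintptr_t misalign = (uintptr_t)p % 4096u;
        uintptr_t offset = misalign ? 4096u - misalign : 0;
        if (sz > offset)
            madvise((char *)p + offset, sz - offset, MADV_HUGEPAGE); /* advisory; failure is harmless */
    }
#endif
    return p;
}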

For a simple "fill memory" test I see roughly a 2x timing improvement.

function f(N)
    mem = Memory{Int}(undef, N)
    mem .= 0
    mem[end]
end

f(1)
@time f(1_000_000)
0.001464 seconds (2 allocations: 7.633 MiB) # this branch
0.003431 seconds (2 allocations: 7.633 MiB) # master

I would appreciate help with this PR as I have no experience writing C code and little knowledge of Julia internals. In particular, I think it would make sense to have a startup option controlling the minimal eligible allocation size, which should default to the system's hugepage size; for this initial implementation the same constant as in numpy is hardcoded.
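
One possible way to pick that default at startup, assuming a Linux kernel that exposes hpage_pmd_size; the helper name and the 2 MiB fallback are mine, not part of the PR:

#include <stddef.h>
#include <stdio.h>

/* Read the transparent hugepage size reported by the kernel, falling back to 2 MiB. */
static size_t thp_page_size_sketch(void)
{
    size_t sz = (size_t)2 * 1024 * 1024;
    FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/hpage_pmd_size", "r");
    if (f != NULL) {
        unsigned long long v = 0;
        if (fscanf(f, "%llu", &v) == 1 && v > 0)
            sz = (size_t)v;
        fclose(f);
    }
    return sz;
}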

@oscardssmith added the performance (Must go faster) and arrays labels on Oct 15, 2025
@Keno (Member) commented Oct 15, 2025

What kernel are you on? THP is usually automatic.

@oscardssmith (Member) commented

IIUC it requires alignment, so you don't get huge pages unless you ask for them. The transparent part is that they aren't specially segmented memory.

@Keno (Member) commented Oct 16, 2025

Huge pages always need to be aligned. Transparent means they're ordinary pages, rather than being mmap'd from hugetlb, which is the (very) old way to get huge pages. But regardless, a modern kernel should automatically assign huge pages to sufficiently large mappings that it thinks are used. My suspicion here is that the reported perf difference isn't actually due to huge pages, but rather that, for the initial allocation, the hugepage advice overrides the fault granularity. We might see even better performance by prefaulting the pages. However, if that's the case, then it's a more general concern and in particular is workload dependent. Does Python actually do this madvise by default?
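
For reference, one way to prefault a large anonymous mapping up front is MAP_POPULATE, which asks the kernel to fault the pages in at mmap time; this is a sketch of the idea mentioned above (assuming Linux), not code from the PR:

#include <stddef.h>
#include <sys/mman.h>

/* Allocate sz bytes and have the kernel prefault them, so the first write
 * doesn't pay the page-fault cost one page at a time. */
static void *alloc_prefaulted(size_t sz)
{
    void *p = mmap(NULL, sz, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}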

@artemsolod (Author) commented

@Keno, @oscardssmith thanks for looking into this!

I am testing on a dedicated server running Ubuntu 25.04, kernel 6.14:

uname -a
Linux ubuntu-c-8-intel-ams3-01 6.14.0-32-generic #32-Ubuntu SMP PREEMPT_DYNAMIC Fri Aug 29 14:21:26 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

From my experiments, the performance jump happens only when either an explicit madvise is called or /sys/kernel/mm/transparent_hugepage/enabled is set to always (by default it's set to madvise). I first suspected that using mmap to allocate could be sufficient, but this does not seem to work. Here is a test script comparing a manual madvise against the usual Julia memory allocation; it can be run on 1.12 or master.

import Mmap: MADV_HUGEPAGE

function memory_from_mmap(n)
    capacity = n*8
    # PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS  
    ptr = @ccall mmap(C_NULL::Ptr{Cvoid}, capacity::Csize_t, 3::Cint, 34::Cint, (-1)::Cint, 0::Csize_t)::Ptr{Cvoid}
    retcode = @ccall madvise(ptr::Ptr{Cvoid}, capacity::Csize_t, MADV_HUGEPAGE::Cint)::Cint
    iszero(retcode) || @warn "Madvise HUGEPAGE failed"

    ptr_int = convert(Ptr{Int}, ptr)
    mem = unsafe_wrap(Memory{Int}, ptr_int, n; own=false)
end

function f(N; with_mmap=false)
    if with_mmap
        mem = memory_from_mmap(N)
    else
        mem = Memory{Int}(undef, N)
    end
    mem .= 0
    mem[end]
end

f(1; with_mmap=true)
f(1; with_mmap=false)
N = 10_000_000
GC.enable(false)
@time f(N; with_mmap=true)  # 0.015535 seconds (1 allocation: 32 bytes)
@time f(N; with_mmap=false) # 0.043966 seconds (2 allocations: 76.297 MiB)

With echo always > /sys/kernel/mm/transparent_hugepage/enabled both versions are fast; with echo never > /sys/kernel/mm/transparent_hugepage/enabled both are slow. With the default, echo madvise > /sys/kernel/mm/transparent_hugepage/enabled, performance differs sharply depending on whether with_mmap=true.
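
For anyone reproducing this, the active mode is the bracketed word in that sysfs file; a small C sketch to print it (hypothetical helper, not related to the PR):

#include <stdio.h>
#include <string.h>

/* Print the active THP mode, i.e. the bracketed entry in
   /sys/kernel/mm/transparent_hugepage/enabled ("always", "madvise", or "never"). */
static void print_thp_mode(void)
{
    char buf[128] = {0};
    FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/enabled", "r");
    if (f == NULL)
        return;
    if (fgets(buf, sizeof buf, f) != NULL) {
        char *lb = strchr(buf, '[');
        char *rb = lb ? strchr(lb, ']') : NULL;
        if (lb && rb) {
            *rb = '\0';
            printf("THP mode: %s\n", lb + 1);
        }
    }
    fclose(f);
}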

I've also tried commenting out the madvise call in this PR branch; that gives the same performance as master, i.e. it's slower again.

As for whether this is done in Python:

  • numpy definitely does it and relies on madvise being enabled in the system; the threshold is hardcoded (source and documentation).
  • CPython also has an explicit madvise (or rather, I see it in their mimalloc code, source). However, the mechanism is more sophisticated, and they mention in comments that they expect it not to be necessary:
      // Many Linux systems don't allow MAP_HUGETLB but they support instead
      // transparent huge pages (THP). Generally, it is not required to call `madvise` with MADV_HUGE
      // though since properly aligned allocations will already use large pages if available
      // in that case -- in particular for our large regions (in `memory.c`).
      // However, some systems only allow THP if called with explicit `madvise`, so
      // when large OS pages are enabled for mimalloc, we call `madvise` anyways.

@oscardssmith (Member) commented

Seems like it's almost a bug that this doesn't just work by default, but 2x perf is 2x perf, so I say we merge this with a note that once Linux starts doing the not-dumb thing by default we can delete it.

@gbaraldi (Member) commented

I'm confused why glibc isn't doing this, but then again their allocator is middling at best.
