Reduce address space waste of std.heap.page_allocator on Windows #17413

mikastiv opened this issue Oct 6, 2023 · 12 comments
Labels
os-windows, standard library

@mikastiv

mikastiv commented Oct 6, 2023

Currently, the PageAllocator does allocations aligned to std.mem.page_size (4KB on Windows) using VirtualAlloc. Windows requires its allocations to be aligned on 64KB boundaries, so every time this allocator is used, up to 60KB of address space is wasted.
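
The arithmetic behind the "up to 60KB" figure can be sketched as follows. This is only an illustration: `wastedAddressSpace` is a hypothetical helper, and the 4KB page size and 64KB granularity are hard-coded from the numbers quoted in this issue rather than queried from the OS. It assumes Zig 0.11 or later for the `std.mem.alignForward(usize, ...)` signature.

```zig
const std = @import("std");

// Assumed values taken from the issue text, not queried from the OS.
const page_size: usize = 4 * 1024;
const allocation_granularity: usize = 64 * 1024;

/// Hypothetical helper: address space that becomes unusable after
/// reserving `len` bytes at a granularity-aligned base address.
fn wastedAddressSpace(len: usize) usize {
    // The reservation itself covers whole pages.
    const used = std.mem.alignForward(usize, len, page_size);
    // The next reservation can only start on a granularity boundary.
    const stride = std.mem.alignForward(usize, len, allocation_granularity);
    return stride - used;
}

pub fn main() void {
    const sizes = [_]usize{ 1, 4096, 4097, 65536, 65537 };
    for (sizes) |len| {
        std.debug.print("request {d} B -> wasted {d} B\n", .{ len, wastedAddressSpace(len) });
    }
}
```

A request that is already a multiple of 64KB wastes nothing under this model; a single 4KB page leaves the remaining 15 pages of its 64KB slot unreachable.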

I made a small program to illustrate. All the addresses are 64KB apart (except the first one):

const alloc = std.heap.page_allocator;

for (0..16) |_| {
    const block = try alloc.alloc(u8, 4096);
    std.debug.print("{*}\n", .{block});
}

Outputs:

u8@2babf240000
u8@2babf360000
u8@2babf370000
u8@2babf380000
u8@2babf390000
u8@2babf4a0000
u8@2babf4b0000
u8@2babf4c0000
u8@2babf4d0000
u8@2babf4e0000
u8@2babf4f0000
u8@2babf500000
u8@2babf510000
u8@2babf520000
u8@2babf530000
u8@2babf540000
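
As a quick check of the listing above, the following sketch tests whether every printed address sits on a 64KB boundary and counts how many consecutive pairs are exactly 64KB apart. The addresses are copied from the output; `allAligned` and `countExactGaps` are hypothetical helper names introduced only for this illustration.

```zig
const std = @import("std");

const granularity: u64 = 0x10000; // 64KB

// Addresses copied verbatim from the program output in this issue.
const addresses = [_]u64{
    0x2babf240000, 0x2babf360000, 0x2babf370000, 0x2babf380000,
    0x2babf390000, 0x2babf4a0000, 0x2babf4b0000, 0x2babf4c0000,
    0x2babf4d0000, 0x2babf4e0000, 0x2babf4f0000, 0x2babf500000,
    0x2babf510000, 0x2babf520000, 0x2babf530000, 0x2babf540000,
};

/// True if every address is a multiple of the allocation granularity.
fn allAligned(addrs: []const u64) bool {
    for (addrs) |a| {
        if (a % granularity != 0) return false;
    }
    return true;
}

/// Number of consecutive pairs that are exactly one granule apart.
fn countExactGaps(addrs: []const u64) usize {
    var n: usize = 0;
    var i: usize = 1;
    while (i < addrs.len) : (i += 1) {
        if (addrs[i] - addrs[i - 1] == granularity) n += 1;
    }
    return n;
}

pub fn main() void {
    std.debug.print("all aligned to 64KB: {}\n", .{allAligned(&addresses)});
    std.debug.print("gaps of exactly 64KB: {d} of {d}\n", .{ countExactGaps(&addresses), addresses.len - 1 });
}
```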

A simple solution would be to use Windows' allocation granularity instead of the page size, although this would commit at least 64KB for every allocation instead of 4KB.

@matu3ba
Contributor

matu3ba commented Oct 6, 2023

For context, according to https://stackoverflow.com/questions/20023446/is-virtualalloc-alignment-consistent-with-size-of-allocation, "64KB is the value of SYSTEM_INFO.dwAllocationGranularity". 4KB is the page size, and allocations are only aligned to full page sizes.

It looks like Windows uses memory overcommit only for stack sizes and not for heap allocations, see https://superuser.com/questions/1194263/will-microsoft-windows-10-overcommit-memory.

this would commit at least 64KB for every allocation instead of 4KB.

Memory is only committed if the page is accessed, not before. See also https://stackoverflow.com/questions/20023446/is-virtualalloc-alignment-consistent-with-size-of-allocation.

I made a small program to illustrate. All the addresses are 64KB apart (except the first one):

This does not specify how this is a problem for your use case. Allocators are expected to do batching for performance. Can you justify why you need control over the default amount of reserved pages?

@expikr
Contributor

expikr commented Oct 6, 2023

@notcancername
Contributor

This does not specify how this is a problem for your use case. Allocators are expected to do batching for performance. Can you justify why you need control over the default amount of reserved pages?

It is generally assumed that std.heap.page_allocator allocates at std.mem.page_size granularity (obviously). Personally, I view this as the problem. std.heap.PageAllocator ought to have a preferred_length comptime field that specifies the ideal length of an allocation in pages.
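
One way to picture the preferred_length idea is the sketch below. It is purely hypothetical: `PageAllocatorSketch`, `unit`, and `reservedLen` are names invented for this illustration and are not part of std.heap.PageAllocator, and the 4KB page size is hard-coded from the value quoted in this thread.

```zig
const std = @import("std");

// Assumed page size, matching the 4KB quoted in this thread.
const page_size: usize = 4096;

/// Hypothetical sketch, not the real std.heap.PageAllocator:
/// a type parameterized by the preferred allocation length in pages.
fn PageAllocatorSketch(comptime preferred_length: usize) type {
    return struct {
        /// Size in bytes of one preferred allocation unit.
        pub const unit: usize = preferred_length * page_size;

        /// Round a request up to a whole number of preferred units.
        pub fn reservedLen(len: usize) usize {
            return ((len + unit - 1) / unit) * unit;
        }
    };
}

pub fn main() void {
    const OnePage = PageAllocatorSketch(1); // page-granular
    const SixteenPages = PageAllocatorSketch(16); // 64KB-granular
    std.debug.print("1 page:   {d}\n", .{OnePage.reservedLen(5000)});
    std.debug.print("16 pages: {d}\n", .{SixteenPages.reservedLen(5000)});
}
```

With a value of 16 the unit matches the 64KB granularity discussed above, so a caller that asks how much it actually received could use the whole slot instead of a single page.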

@mikastiv
Author

mikastiv commented Oct 6, 2023

Can you justify why you need control over the default amount of reserved pages?

No, I was playing around with std.heap.page_allocator and noticed that the allocations were always aligned to 64KB

std.heap.PageAllocator ought to have a preferred_length comptime field that specifies the ideal length of an allocation in pages.

I like this idea

@squeek502
Collaborator

@mikastiv why was this closed?

@mikastiv
Author

From the answers I got, I didn't get the feeling that it needed to be addressed, and there haven't been any posts since last week.

@squeek502
Collaborator

I think it's worth keeping open if you don't mind. When combined with #17377, it is behavior that will end up mattering since the GeneralPurposeAllocator intentionally tries to avoid re-using virtual addresses and this caveat will make Windows exhaust the virtual address space more quickly.
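
A rough upper bound shows why the stride matters when addresses are never reused. The sketch assumes 128 TiB of user-mode address space (the figure for x64 Windows 8.1 and later; this number is not from the thread) and ignores everything else mapped into the process. `allocationsUntilExhaustion` is a hypothetical helper.

```zig
const std = @import("std");

// Assumption: 128 TiB (2^47 bytes) of user-mode address space.
const user_address_space: u64 = 128 * 1024 * 1024 * 1024 * 1024;

/// Upper bound on the number of single-page allocations that fit
/// when each one consumes `stride` bytes of address space.
fn allocationsUntilExhaustion(stride: u64) u64 {
    return user_address_space / stride;
}

pub fn main() void {
    const at_64k = allocationsUntilExhaustion(64 * 1024);
    const at_4k = allocationsUntilExhaustion(4 * 1024);
    std.debug.print("64KB stride: {d} allocations\n", .{at_64k});
    std.debug.print("4KB stride:  {d} allocations\n", .{at_4k});
    std.debug.print("ratio: {d}x\n", .{at_4k / at_64k});
}
```

Under these assumptions, never-reused 4KB allocations run out of address space 16 times sooner with a 64KB stride than with a 4KB stride.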

squeek502 reopened this Oct 12, 2023
squeek502 added the standard library and os-windows labels Oct 12, 2023
@jnordwick

jnordwick commented Oct 26, 2023

GeneralPurposeAllocator intentionally tries to avoid re-using virtual addresses

Isn't this going to blow out the TLB? It is a very serious issue with some processes with a large working set, especially when huge pages aren't in use.

@squeek502
Collaborator

squeek502 commented Oct 26, 2023

Isn't this going to blow out the TLB? It is a very serious issue with some processes with a large working set, especially when huge pages aren't in use.

AFAIK it's primarily a strategy intended for catching double frees. When #12484 is addressed, it almost certainly won't be part of the release-mode GeneralPurposeAllocator.

@matu3ba
Contributor

matu3ba commented Dec 29, 2023

I think I am running into this when testing dynamic library loads, although at least partially due to the Windows shenanigans of mapping uninitialized libraries into memory even though an error occurs on usage of process mitigation.
See matu3ba/win32k-mitigation#1.

I'll have a look tomorrow with tooling. UPDATE: WPA shows me this behavior, although I'm very surprised that <4MB is the virtual memory limit. I used this nice tutorial (https://learn.microsoft.com/en-us/cpp/build-insights/tutorials/vcperf-and-wpa?view=msvc-170 https://learn.microsoft.com/en-us/windows-hardware/test/wpt/memory-footprint-optimization-exercise-2) and

reg add "HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\child_ntdll_only.exe" /v TracingFlags /t REG_DWORD /d 1 /f
reg delete "HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\child_ntdll_only.exe"

to get ~4MB virtual memory usage

[Image: 20231230LoadLibrary_virtualalloc_issue]

I'll investigate more.

This is the trace of the C program (<0.5MB virtual memory usage):

[Image: 20231230LoadLibrary_calloc_noissue]

Most likely the process or job limit https://learn.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-jobobject_extended_limit_information or https://learn.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-jobobject_basic_limit_information is too low.

UPDATE: My problem is unrelated to this issue; it was caused by incorrect usage, an ABI problem, or another problem on my side. The Windows behavior is however still very interesting and makes me question the robustness of the implementation.

Vexu added this to the 0.14.0 milestone Mar 26, 2024
@nevakrien

nevakrien commented Apr 12, 2024

Related issue: I am getting a weird print statement from this allocator on a caught error, which I believe is about the page space.

C:\Users\Owner\Desktop>oom.exe
Allocating memory until a crash...
Total memory allocated: 26280 megabytes error.Unexpected: GetLastError(1455): The paging file is too small for this operation to complete.

C:\Users\Owner\AppData\Local\Microsoft\WinGet\Packages\zig.zig_Microsoft.Winget.Source_8wekyb3d8bbwe\zig-windows-x86_64-0.11.0\lib\std\os\windows.zig:1560:49: 0x7ff7807a1a50 in VirtualAlloc (oom.exe.obj)
            else => |err| return unexpectedError(err),
                                                ^
C:\Users\Owner\AppData\Local\Microsoft\WinGet\Packages\zig.zig_Microsoft.Winget.Source_8wekyb3d8bbwe\zig-windows-x86_64-0.11.0\lib\std\heap\PageAllocator.zig:24:36: 0x7ff7807a187a in alloc (oom.exe.obj)
        const addr = w.VirtualAlloc(
                                   ^
C:\Users\Owner\AppData\Local\Microsoft\WinGet\Packages\zig.zig_Microsoft.Winget.Source_8wekyb3d8bbwe\zig-windows-x86_64-0.11.0\lib\std\mem\Allocator.zig:225:53: 0x7ff7807a1fad in allocBytesWithAlignment__anon_3940 (oom.exe.obj)
    const byte_ptr = self.rawAlloc(byte_count, log2a(alignment), return_address) orelse return Error.OutOfMemory;
                                                    ^
C:\Users\Owner\AppData\Local\Microsoft\WinGet\Packages\zig.zig_Microsoft.Winget.Source_8wekyb3d8bbwe\zig-windows-x86_64-0.11.0\lib\std\mem\Allocator.zig:105:62: 0x7ff7807a12f6 in create__anon_3182 (oom.exe.obj)
    const ptr: *T = @ptrCast(try self.allocBytesWithAlignment(@alignOf(T), @sizeOf(T), @returnAddress()));
                                                             ^
C:\Users\Owner\Desktop\oom.zig:21:41: 0x7ff7807a105a in main (oom.exe.obj)
        const newNode = allocator.create(Node) catch { //|err| {
                                        ^
C:\Users\Owner\AppData\Local\Microsoft\WinGet\Packages\zig.zig_Microsoft.Winget.Source_8wekyb3d8bbwe\zig-windows-x86_64-0.11.0\lib\std\start.zig:339:65: 0x7ff7807a1741 in WinStartup (oom.exe.obj)
    std.os.windows.kernel32.ExitProcess(initEventLoopAndCallMain());
                                                                ^
???:?:?: 0x7fff5b397343 in ??? (KERNEL32.DLL)
???:?:?: 0x7fff5b6a26b0 in ??? (ntdll.dll)
error.Unexpected: GetLastError(1455): The paging file is too small for this operation to complete.

C:\Users\Owner\AppData\Local\Microsoft\WinGet\Packages\zig.zig_Microsoft.Winget.Source_8wekyb3d8bbwe\zig-windows-x86_64-0.11.0\lib\std\os\windows.zig:1560:49: 0x7ff7807a1a50 in VirtualAlloc (oom.exe.obj)
            else => |err| return unexpectedError(err),
                                                ^
C:\Users\Owner\AppData\Local\Microsoft\WinGet\Packages\zig.zig_Microsoft.Winget.Source_8wekyb3d8bbwe\zig-windows-x86_64-0.11.0\lib\std\heap\PageAllocator.zig:24:36: 0x7ff7807a187a in alloc (oom.exe.obj)
        const addr = w.VirtualAlloc(
                                   ^
C:\Users\Owner\AppData\Local\Microsoft\WinGet\Packages\zig.zig_Microsoft.Winget.Source_8wekyb3d8bbwe\zig-windows-x86_64-0.11.0\lib\std\mem\Allocator.zig:225:53: 0x7ff7807a22dd in allocBytesWithAlignment__anon_3941 (oom.exe.obj)
    const byte_ptr = self.rawAlloc(byte_count, log2a(alignment), return_address) orelse return Error.OutOfMemory;
                                                    ^
C:\Users\Owner\AppData\Local\Microsoft\WinGet\Packages\zig.zig_Microsoft.Winget.Source_8wekyb3d8bbwe\zig-windows-x86_64-0.11.0\lib\std\mem\Allocator.zig:105:62: 0x7ff7807a13b6 in create__anon_3183 (oom.exe.obj)
    const ptr: *T = @ptrCast(try self.allocBytesWithAlignment(@alignOf(T), @sizeOf(T), @returnAddress()));
                                                             ^
C:\Users\Owner\Desktop\oom.zig:24:42: 0x7ff7807a10bf in main (oom.exe.obj)
                leaker = allocator.create(u8) catch {
                                         ^
C:\Users\Owner\AppData\Local\Microsoft\WinGet\Packages\zig.zig_Microsoft.Winget.Source_8wekyb3d8bbwe\zig-windows-x86_64-0.11.0\lib\std\start.zig:339:65: 0x7ff7807a1741 in WinStartup (oom.exe.obj)
    std.os.windows.kernel32.ExitProcess(initEventLoopAndCallMain());
                                                                ^
???:?:?: 0x7fff5b397343 in ??? (KERNEL32.DLL)
???:?:?: 0x7fff5b6a26b0 in ??? (ntdll.dll)
Memory has been released.

C:\Users\Owner\Desktop>

The code is:

const std = @import("std");

const Node = struct {
    next: ?*Node,
    data: [1048576 - @sizeOf(?*Node)]u8,
};

comptime {
    std.debug.assert(@sizeOf(Node) == 1048576);
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    var head: ?*Node = null;
    var current: ?*Node = null;
    var nodeCount: usize = 0;
    var leaker: ?*u8 = null; //we have to leak since pointers are more than a byte

    std.debug.print("Allocating memory until a crash...\n", .{});
    while (true) {
        const newNode = allocator.create(Node) catch { //|err| {
            //std.debug.print("Failed to create node, system possibly OOM. Error: {}\n", .{ err });
            while (true) {
                leaker = allocator.create(u8) catch {
                    break;
                };
            }
            break;
        };

        newNode.next = null; // Set next pointer to null after allocation

        if (head == null) {
            head = newNode;
        } else {
            current.?.next = newNode;
        }
        current = newNode;
        nodeCount += 1;

        std.debug.print("\rTotal memory allocated: {d} megabytes ", .{nodeCount});
    }

    // Cleanup, freeing all nodes
    while (head) |node| {
        const next = node.next;
        allocator.destroy(node);
        head = next;
    }
    std.debug.print("Memory has been released.\n", .{});
}

@andrewrk
Member

andrewrk commented Feb 23, 2025

The original writeup no longer applies, but there are still related issues to fix, such as:

error: error.Unexpected: GetLastError(1455): The paging file is too small for this operation to complete.

C:\Users\CI\actions-runner-2\_work\zig\zig\lib\std\Thread.zig:637:43: 0x7ff6fb7c8b29 in spawn__anon_134827 (zig.exe.obj)
            return windows.unexpectedError(errno);
                                          ^
C:\Users\CI\actions-runner-2\_work\zig\zig\lib\std\Thread.zig:421:32: 0x7ff6fb61df45 in spawn__anon_55132 (zig.exe.obj)
    const impl = try Impl.spawn(config, function, args);
                               ^
C:\Users\CI\actions-runner-2\_work\zig\zig\lib\std\Thread\Pool.zig:58:40: 0x7ff6fb61db02 in init (zig.exe.obj)
        thread.* = try std.Thread.spawn(.{
                                       ^

Seems to happen in low memory conditions.

I believe other than that, remaining issues are covered by:


9 participants