This roadmap documents the migration of the meniOS userland memory allocator from a first-fit freelist design to a buddy allocator.
✅ MILESTONE COMPLETE: Achieved in v0.1.666 (2025-10-20). All 20 buddy allocator issues complete (100%). The production-ready buddy allocator provides robust memory management for complex userland applications including Doom, with comprehensive security hardening and performance optimizations.
The current first-fit allocator works but has limitations:
- External fragmentation: Arbitrary-sized blocks can create unusable gaps
- Coalescing complexity: Must scan neighbors to merge free blocks
- Performance: Linear search through freelist for suitable blocks
- No power-of-2 optimization: Can't leverage alignment tricks
Buddy allocator advantages:
- Bounded fragmentation: Power-of-2 sizes limit worst-case fragmentation
- Fast coalescing: Buddy address calculated via XOR, no search needed
- Order-segregated lists: O(1) lookup for appropriate size class
- Alignment guarantees: Blocks naturally aligned to their size
- Proven design: Used in Linux kernel, FreeBSD, and many production systems
The migration follows a careful sequence to minimize risk:
- Survey → Understand current implementation
- Design → Define buddy parameters and data structures
- Infrastructure → Rewrite arena setup
- Core Logic → Implement split/coalesce
- Integration → Connect to malloc/free API
- Extensions → Handle realloc and large allocations
- Validation → Comprehensive testing
- Cleanup → Remove old code and document
| Issue | Title | Status | Priority | Effort |
|---|---|---|---|---|
| #245 | Survey Current Heap Implementation | ✅ Complete | High | 2-3 days |
| #246 | Define Buddy Allocator Orders and Configuration | ✅ Complete | High | 1-2 days |
| #247 | Rewrite Arena Setup for Buddy Allocator | ✅ Complete | High | 3-4 days |
| #248 | Implement Buddy Split and Coalesce Operations | ✅ Complete | Critical | 4-5 days |
| #249 | Integrate Buddy Allocator with malloc/free | ✅ Complete | Critical | 3-4 days |
| #250 | Adapt realloc/reallocarray for Buddy Allocator | ✅ Complete | High | 2-3 days |
| #251 | Update Direct mmap Path for Large Allocations | ✅ Complete | Medium | 2 days |
| #252 | Add Buddy Allocator Diagnostics and Tests | ✅ Complete | High | 3-4 days |
| #253 | Cleanup and Document Buddy Allocator Migration | ✅ Complete | Medium | 2-3 days |
Total Estimated Effort: 22-30 days
```
#245 (Survey)
  ↓
#246 (Define Orders)
  ↓
#247 (Arena Setup)
  ↓
#248 (Split/Coalesce) ──→ #252 (Tests)
  ↓                         ↓
#249 (malloc/free) ──────→ #252 (Tests)
  ↓                         ↓
#250 (realloc)           #253 (Cleanup)
#251 (Direct mmap)
```
A buddy allocator manages memory in power-of-2 sized blocks organized by "order":
- Order k → Block size = 2^k bytes
- Each order maintains a freelist of available blocks
- Allocation: Find smallest order that fits, split larger blocks if needed
- Deallocation: Return block to freelist, coalesce with buddy if free
For a block at address `addr` with size 2^k:

```c
buddy_addr = addr ^ (1 << k);
```

This XOR trick works because:
- Blocks at order k are aligned to 2^k boundaries
- Buddy pairs differ only in bit k
- XOR flips bit k to find the partner
Example (order 12 = 4096 bytes):

```
Block A: 0x100000 (bit 12 = 0)
Block B: 0x101000 (bit 12 = 1) ← buddy
```
When no block available at order k:
- Find block at order k+1
- Split into two buddies at order k
- Return one buddy, add other to freelist[k]
- Recurse if needed
When freeing block at order k:
- Calculate buddy address
- Check if buddy is free at same order
- If yes: merge into order k+1, recurse
- If no: add block to freelist[k]
```c
#define MIN_ORDER 5                             // 32 bytes (fits header + small data)
#define MAX_ORDER 22                            // 4 MiB (reasonable arena size)
#define ARENA_SIZE (1 << MAX_ORDER)             // 4 MiB
#define NUM_ORDERS (MAX_ORDER - MIN_ORDER + 1)  // 18 freelists

typedef struct block_header block_header_t;
/* block_header_t carries both payload metadata and the buddy freelist hooks */

typedef struct arena {
    void *base;                             // Start address
    size_t size;                            // Total arena size
    block_header_t *freelists[NUM_ORDERS];  // Per-order freelists
    struct arena *next;                     // Arena chain
} arena_t;
```

Goal: Complete understanding of current allocator
Deliverables:
- `grow_heap` currently mmaps the next arena (starting at 1 MiB, doubling up to 32 MiB) and seeds a single `block_header_t` that spans the arena. The block is pushed onto the global freelist and linked into `arena_list_head`.
- `block_header_t` is 0x50 bytes, aligned to 16, and stores neighbour pointers (next/prev), freelist linkage (free_next/free_prev), the owning arena pointer (NULL for direct `mmap`), the mmap base/size for direct mappings, payload size in bytes, and flags (`BLOCK_FLAG_FREE`, `BLOCK_FLAG_DIRECT`). Payload begins immediately after the header.
- Allocation is pure first-fit: `find_suitable_block` linearly walks `free_list_head`, `split_block` carves the tail of oversized blocks, and `malloc` does not segregate by size. Freeing coalesces with adjacent free blocks inside the same arena and pushes the merged block back to the freelist. Requests above `ARENA_MAX_SIZE - arena_overhead` or with large alignments fall back to direct `mmap` and bypass arena bookkeeping.
- Risk assessment: arbitrary splitting/coalescing coupled with a single freelist makes size accounting fragile (pattern mismatches observed in `/bin/malloc_stress`). Fragmentation grows quickly under heavy workloads, alignment relies on header maths, and large allocations bypass arenas entirely, motivating the switch to a deterministic buddy scheme.
Goal: Define buddy allocator parameters
Deliverables:
- Order bounds. Adopt `MIN_ORDER = 7` (128 B blocks) and `MAX_ORDER = 27` (128 MiB blocks). `MIN_ORDER` safely covers the 0x50-byte header plus 16-byte alignment; `MAX_ORDER` matches the new 128 MiB arena size. This yields 21 freelists (orders 7–27).
- Arena size. Each arena will map `1u << MAX_ORDER` bytes (128 MiB), aligned to 2 MiB for paging. New arenas are seeded as a single order-27 block before splitting.
- Buddy metadata. The existing 0x50-byte header now doubles as the buddy freelist node, so allocated blocks keep their metadata for debugging while free blocks reuse the same storage for `buddy_next`/`buddy_prev`:

```c
typedef struct block_header {
    struct block_header *buddy_next;
    struct block_header *buddy_prev;
    struct arena *arena;     // Owning arena; NULL for direct mmaps
    uint32_t buddy_order;    // 7..27 inclusive
    uint32_t buddy_flags;    // BUDDY_FREE, BUDDY_USED
    uintptr_t buddy_offset;  // Offset from arena->buddy_base
    /* ... existing payload fields (mapping_base, size, flags, etc.) ... */
} block_header_t;
```

- Arena descriptor.

```c
typedef struct arena {
    void *base;
    size_t size;                    // always 128 MiB
    struct arena *next;
    block_header_t *freelists[21];  // orders 7..27
} arena_t;
```

- Order/size table (excerpt).

| Order | Block size |
|---|---|
| 7 | 128 B |
| 8 | 256 B |
| 9 | 512 B |
| … | … |
| 20 | 1 MiB |
| 21 | 2 MiB |
| 27 | 128 MiB |

- Design notes. Orders ≤ 20 (≤ 1 MiB) cover the bulk of libc allocations. Larger orders handle gcc/Doom workloads without immediately falling back to direct `mmap`. Direct mappings still handle requests exceeding order 27 or alignments beyond the buddy range.
Goal: Bootstrap buddy system
Key Changes:
- Replace the single global freelist with per-order freelists stored in each arena (and optionally a global array for quick lookup). Provide helpers such as `buddy_push(order, block)` / `buddy_pop(order)` so allocation code no longer touches the legacy list.
- When `grow_heap` mmaps a 128 MiB arena, initialise an `arena_t` structure (base, size, `freelists[21]`, link into `arena_list_head`). Seed the arena by materialising an order-27 `block_header_t` at offset 0 inside the buddy payload region.
- Ensure the seeded block records `order = MAX_ORDER`, `flags = BUDDY_FREE`, and `arena = current arena`. Defer splitting to the allocation path that consumes blocks from the freelists.
- Remove the old `free_list_head` usage in favour of order-aware insertion/removal. Existing arena metadata (base pointer, size) becomes part of the new `arena_t` so free/coalesce can locate the owning freelist quickly.
Goal: Implement split and coalesce
Critical Functions:
```c
block_header_t *buddy_split(block_header_t *block, int target_order);
block_header_t *buddy_coalesce(block_header_t *block);
void *buddy_addr(void *block, int order); // Calculate buddy address
```

Status: ✅ Completed. Arenas now maintain per-order buddy freelists, `buddy_split_to_order` splits large blocks down to a requested order while seeding right-side siblings, and `buddy_coalesce_block` merges a freed block back up through the hierarchy. Host-only regression tests (`test/test_buddy_allocator.c`) exercise both operations to ensure the order bookkeeping and freelist accounting stay consistent across splits and merges. Each arena still mmaps 128 MiB, and the allocator keeps reserving additional arenas on demand, effectively unbounded until the process exhausts VM regions or the kernel runs out of physical memory.
Goal: Connect to malloc/free
Changes:
```c
/* Pseudocode sketch; pop_smallest_free_above and direct_mmap_alloc are
 * illustrative helper names. */
void *malloc(size_t size) {
    int order = compute_order(size + sizeof(block_header_t));
    block_header_t *block = allocate_from_order(order);
    if (!block && order <= MAX_ORDER) {
        // No exact fit: take the smallest free block above `order`
        // and split it down to the requested order
        block_header_t *bigger = pop_smallest_free_above(order);
        if (bigger)
            block = buddy_split(bigger, order);
    }
    if (!block) {
        // Fall back to direct mmap (or grow a new arena)
        return direct_mmap_alloc(size);
    }
    return (void *)(block + 1);
}

void free(void *ptr) {
    block_header_t *block = (block_header_t *)ptr - 1;
    if (block->flags & DIRECT_MMAP) {
        // Unmap the original mapping base, not the header address,
        // which may sit past alignment padding
        munmap(block->mapping_base, block->size);
    } else {
        mark_free(block);
        buddy_coalesce(block);
    }
}
```

Goal: Complete allocator features
realloc (#250): ✅ Done
- Buddy-managed blocks now split when shrinking and attempt in-place expansion by merging the right-hand buddy before falling back to copy+free. `reallocarray` rides the same path, so overflow checks feed the buddy allocator automatically.
Direct mmap (#251): ✅ Done
- Requests larger than the buddy ceiling or requiring alignments above 16 bytes now funnel through a single helper that over-allocates, aligns, and tracks the mapping so `free` can `munmap` correctly. `posix_memalign`, `memalign`, `valloc`, and `pvalloc` all ride this path, so unusual alignments no longer depend on the buddy freelists.
Goal: Validate correctness and performance
Test Coverage:
- Added host regression `test/test_malloc_direct.c` validating both oversized `malloc` and high-alignment `posix_memalign` paths to ensure the shared direct-mmap helper remains correct.
- Introduced `menios_malloc_stats()` alongside host test `test/test_malloc_stats.c` so we can assert arena/direct counters without manual inspection.
- Split/coalesce unit tests
- Alignment validation (`posix_memalign`)
- Fragmentation stress tests
- Large allocation scenarios
- Update `malloc_stress` for buddy patterns
- Performance comparison vs first-fit
Goal: Production-ready code
Status: ✅ Completed. Legacy first-fit code paths have been removed, the buddy design is documented, and regression coverage (`test/test_buddy_allocator.c`, `test/test_malloc_direct.c`, `test/test_malloc_stats.c`) guards allocator correctness, large direct mappings, and diagnostics.
The migration is complete when:
- All buddy allocator issues (#245-#253) closed
- All existing malloc/free/realloc tests pass
- malloc_stress runs without errors under heavy load
- No memory leaks detected
- Fragmentation within acceptable bounds
- Performance meets or exceeds first-fit baseline
- Code fully documented with design rationale
- Architecture docs updated
Why it needs Buddy Allocator:
- Native compilation requires robust memory management
- Compiler/linker tools have complex allocation patterns
- Buddy allocator reduces fragmentation for long-running builds
Why it needs Buddy Allocator:
- Game engines stress memory allocator heavily
- Frequent allocation/deallocation of various sizes
- Buddy system's O(1) operations critical for real-time performance
- Alignment guarantees important for graphics/audio buffers
Once buddy allocator is stable, consider:
-
Slab Allocator Layer
- Fast path for common fixed sizes (16, 32, 64, 128 bytes)
- Reduce buddy overhead for small allocations
- Common in kernel memory management
-
Per-CPU Arenas
- Reduce lock contention in threaded environments
- Each CPU gets private arena
- Lock-free fast path
-
NUMA-Aware Allocation
- Allocate from memory local to CPU
- Important for multi-socket systems
- Future-proofing for SMP support
-
Memory Compaction
- Move allocations to reduce fragmentation
- Requires moving GC or cooperation from applications
- Advanced feature for later
- Linux Kernel Buddy Allocator: `mm/page_alloc.c`
- FreeBSD UMA: Universal Memory Allocator design
- Classic Paper: "The Buddy System" by Kenneth Knowlton (1965)
- Modern Analysis: "Dynamic Storage Allocation: A Survey and Critical Review" by Wilson et al.
Last Updated: 2025-10-20
Status: ✅ MILESTONE COMPLETE - 20/20 issues (100%)
Current Release: v0.1.666 "DOOM READY" includes production-ready buddy allocator
Achievement: Complete migration from first-fit to buddy allocator with comprehensive testing and security hardening
Milestone: Buddy Allocator

See Also:
- Road to Shell - ✅ COMPLETE
- Road to GCC - 🚀 Phase 4 Ready
- Road to Doom - ✅ COMPLETE