Optimize hashmap memory allocation with arena #222

Status: Open. 1 commit to be merged into base branch master.

icgmilk commented Jul 24, 2025

This patch replaces malloc with arena allocation for hashmap_node_t and key strings, while keeping the hashmap_t structure allocated with malloc.

This hybrid approach reduces memory fragmentation for frequently allocated small objects and preserves compatibility with existing rehash functionality.
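The split described above can be sketched roughly as follows. This is a minimal illustration, not the code from this patch: every `demo_*` name is hypothetical, the bump allocator is a stand-in for shecc's real `arena_t`, and allocating the bucket array with `calloc` is an assumption made only for the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bump arena; shecc's real arena_t may look different. */
typedef struct {
    char *base;
    size_t used, cap;
} demo_arena_t;

typedef struct demo_node {
    char *key;
    int val;
    struct demo_node *next;
} demo_node_t;

typedef struct {
    demo_node_t **buckets;
    size_t nbuckets;
} demo_map_t;

static int demo_arena_init(demo_arena_t *a, size_t cap)
{
    a->base = malloc(cap);
    a->used = 0;
    a->cap = a->base ? cap : 0;
    return a->base != NULL;
}

/* Carve n bytes out of the arena, rounded up to pointer alignment. */
static void *demo_arena_alloc(demo_arena_t *a, size_t n)
{
    size_t align = sizeof(void *);
    assert(n > 0);
    n = (n + align - 1) & ~(align - 1);
    if (n > a->cap - a->used)
        return NULL;
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

/* The map header (and, in this sketch, its bucket array) stays on malloc. */
static demo_map_t *demo_map_create(size_t nbuckets)
{
    demo_map_t *m = malloc(sizeof(*m));
    if (!m)
        return NULL;
    m->buckets = calloc(nbuckets, sizeof(*m->buckets));
    if (!m->buckets) {
        free(m);
        return NULL;
    }
    m->nbuckets = nbuckets;
    return m;
}

static size_t demo_hash(const char *s)
{
    size_t h = 5381;
    while (*s)
        h = h * 33 + (unsigned char) *s++;
    return h;
}

/* Node and key copy both come from the arena, never from malloc. */
static int demo_map_put(demo_map_t *m, demo_arena_t *a, const char *key, int val)
{
    size_t len = strlen(key) + 1;
    demo_node_t *n = demo_arena_alloc(a, sizeof(*n));
    char *k = demo_arena_alloc(a, len);
    if (!n || !k)
        return 0; /* arena exhausted; any partial allocation is simply left unused */
    memcpy(k, key, len);
    size_t i = demo_hash(key) % m->nbuckets;
    n->key = k;
    n->val = val;
    n->next = m->buckets[i];
    m->buckets[i] = n;
    return 1;
}

static int demo_map_get(const demo_map_t *m, const char *key, int *out)
{
    for (demo_node_t *n = m->buckets[demo_hash(key) % m->nbuckets]; n; n = n->next) {
        if (!strcmp(n->key, key)) {
            *out = n->val;
            return 1;
        }
    }
    return 0;
}

/* Nodes and keys are released all at once together with the arena. */
static void demo_map_free(demo_map_t *m, demo_arena_t *a)
{
    free(m->buckets);
    free(m);
    free(a->base);
}
```

In a layout like this, nodes are never freed individually, so a rehash would only need to relink existing nodes into a new bucket array; that is one plausible reading of why the map structure itself can stay on malloc.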

Performance analysis for out/shecc src/main.c

Memory usage was measured with /usr/bin/time -v.

Before: /usr/bin/time -v ./out/shecc ./src/main.c

        Command being timed: "./out/shecc src/main.c"
        User time (seconds): 0.03
        System time (seconds): 0.10
        Percent of CPU this job got: 100%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.14
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 296800
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 84566
        Voluntary context switches: 1
        Involuntary context switches: 2
        Swaps: 0
        File system inputs: 0
        File system outputs: 456
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

After: /usr/bin/time -v ./out/shecc ./src/main.c

        Command being timed: "./out/shecc src/main.c"
        User time (seconds): 0.04
        System time (seconds): 0.08
        Percent of CPU this job got: 98%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.13
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 296480
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 84444
        Voluntary context switches: 1
        Involuntary context switches: 4
        Swaps: 0
        File system inputs: 0
        File system outputs: 448
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
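
To put the two runs side by side, the relative change can be computed with a small helper such as the one below. `pct_change` is a hypothetical name introduced only for this illustration. Applied to the figures above, maximum resident set size goes from 296800 KB to 296480 KB (about -0.11%) and minor page faults from 84566 to 84444 (about -0.14%). Each configuration was timed once, so differences of this size may be within run-to-run noise.

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent.
 * A negative result means the value went down.
 */
static double pct_change(double before, double after)
{
    assert(before != 0.0);
    return (after - before) / before * 100.0;
}
```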

Summary by Bito

This pull request optimizes memory allocation for the hashmap by using arena allocation for small objects (hashmap nodes and key strings), reducing fragmentation while keeping existing functionality intact. The overall structure remains unchanged. In the single benchmark run shown above, maximum resident set size drops from 296800 KB to 296480 KB (about 0.1%) and wall-clock time from 0.14 s to 0.13 s.

Replace malloc with arena allocation for 'hashmap_node_t' and
key strings, while keeping the 'hashmap_t' structure allocated
with malloc.

This hybrid approach reduces memory fragmentation for frequently
allocated small objects and preserves compatibility with existing
rehash functionality.
@@ -50,7 +50,7 @@ type_t *TY_int;
/* Arenas */
arena_t *INSN_ARENA;
arena_t *HASHMAP_ARENA;
Collaborator

Add a comment here describing which part of the hashmap this arena allocator is responsible for.
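
One possible shape for the requested comment, based only on the PR description, is sketched below. The wording is a suggestion rather than the text that was eventually committed, and the opaque `arena_t` typedef is a stand-in so the fragment compiles on its own.

```c
/* Opaque stand-in; the real arena_t is defined elsewhere in shecc. */
typedef struct arena arena_t;

/* HASHMAP_ARENA: backing storage for hashmap_node_t entries and their
 * key strings. The hashmap_t structure itself is still allocated with
 * malloc (per the PR description) and is not owned by this arena.
 */
arena_t *HASHMAP_ARENA;
```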
