proxylib Free Is Noticeably Slower Than Direct UMF Pool Under Multithreaded Workloads #1077

@lplewa

In proxylib, every free operation must check whether the pointer being freed belongs to the "leak pool." The leak pool is a workaround for recursive allocations, which occur when the malloc function (overridden by proxylib) triggers another call to malloc (often through libraries like hwloc).

This check is performed under a lock, so threads serialize on every free. This results in significant overhead under multithreaded workloads. Although #1072 increases the size of the pool to reduce the time spent under this lock, the goal should be to remove the lock entirely.

Two approaches are under consideration:

- Use Atomic Operations Instead of a Mutex
The leak pool consists of multiple smaller pools linked together. When they are all full, a new pool is created. Instead of relying on a lock, we can manage this pool list with atomic compare-and-swap operations.
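A minimal sketch of the first approach, assuming C11 atomics are available. All names here (`leak_pool_node_t`, `pool_head`, `leak_pool_push`, `leak_pool_contains`) are hypothetical and are not taken from the proxylib sources. It relies on the assumption that sub-pools are only ever added and never removed, which keeps a plain compare-and-swap push free of ABA problems.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one sub-pool of the leak pool. */
typedef struct leak_pool_node {
    char *base;                  /* start of the sub-pool memory */
    size_t size;                 /* size of the sub-pool in bytes */
    struct leak_pool_node *next; /* never modified after publication */
} leak_pool_node_t;

/* Head of the singly linked list of sub-pools. */
static _Atomic(leak_pool_node_t *) pool_head = NULL;

/* Publish a new sub-pool with a compare-and-swap loop instead of a mutex. */
static void leak_pool_push(leak_pool_node_t *node) {
    assert(node != NULL);
    leak_pool_node_t *old =
        atomic_load_explicit(&pool_head, memory_order_relaxed);
    do {
        node->next = old; /* on CAS failure, 'old' holds the current head */
    } while (!atomic_compare_exchange_weak_explicit(
        &pool_head, &old, node, memory_order_release, memory_order_relaxed));
}

/* Lock-free membership check, as it might run on every free(). */
static bool leak_pool_contains(const void *ptr) {
    uintptr_t p = (uintptr_t)ptr;
    leak_pool_node_t *n =
        atomic_load_explicit(&pool_head, memory_order_acquire);
    for (; n != NULL; n = n->next) {
        uintptr_t b = (uintptr_t)n->base;
        if (p >= b && p - b < n->size) {
            return true;
        }
    }
    return false;
}
```

The check still walks the list, so its cost grows with the number of sub-pools, but readers no longer block each other or the thread adding a pool.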

- Use a Single Large Pool
Rather than maintaining multiple pools, reserve one large anonymous mmap region (with PROT_NONE). If more space is needed, simply change the protection flags for new pages. Because the region's base address and size never change, checking whether a pointer belongs to the pool becomes a plain range comparison that needs no lock. On Windows, VirtualAlloc can be used similarly to reserve and commit pages on demand.
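The second approach could look roughly like the following on POSIX systems. This is a sketch under stated assumptions: the 64 MiB reservation size, the 16-byte alignment, and every function name are hypothetical, and the bump allocator commits pages with one `mprotect` call per allocation purely for simplicity.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical amount of address space to reserve (64 MiB). */
#define LEAK_POOL_RESERVE ((size_t)64 << 20)

static char *pool_base;         /* set once by leak_pool_init() */
static _Atomic size_t pool_used; /* bump-allocation offset */

/* Reserve address space only; PROT_NONE pages are not accessible yet. */
static bool leak_pool_init(void) {
    void *p = mmap(NULL, LEAK_POOL_RESERVE, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return false;
    }
    pool_base = p;
    return true;
}

/* Bump-allocate from the reservation, making the touched pages usable. */
static void *leak_pool_alloc(size_t size) {
    assert(pool_base != NULL);
    if (size > LEAK_POOL_RESERVE) {
        return NULL;
    }
    size = (size + 15) & ~(size_t)15; /* keep 16-byte alignment */
    size_t off = atomic_fetch_add(&pool_used, size);
    if (off + size > LEAK_POOL_RESERVE) {
        return NULL; /* reservation exhausted */
    }
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t start = off & ~(page - 1);
    size_t end = (off + size + page - 1) & ~(page - 1);
    /* Re-protecting an already writable page is harmless. */
    if (mprotect(pool_base + start, end - start,
                 PROT_READ | PROT_WRITE) != 0) {
        return NULL;
    }
    return pool_base + off;
}

/* The check done on free(): one range comparison, no lock, no list walk. */
static bool leak_pool_contains(const void *ptr) {
    uintptr_t p = (uintptr_t)ptr;
    uintptr_t b = (uintptr_t)pool_base;
    return pool_base != NULL && p >= b && p - b < LEAK_POOL_RESERVE;
}
```

A production version would likely commit memory in larger chunks to avoid a system call per allocation. On Windows, the same shape should be achievable with `VirtualAlloc` using `MEM_RESERVE` for the initial reservation and `MEM_COMMIT` for the pages that are actually needed.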

Below is a flame graph illustrating performance after #1072:
[Image: flame graph]
