Skip to content
This repository was archived by the owner on Feb 5, 2019. It is now read-only.

Update to jemalloc 4.0.3 #14

Closed
wants to merge 402 commits into from
Closed

Conversation

fbernier
Copy link

As discussed in rust-lang/rust#29196. I guess you should just push my branch upstream instead of merging the pull request into dev.

Jason Evans and others added 30 commits October 16, 2014 12:33
The size of the source allocation is known at this point, so reading the
chunk header can be avoided for the small size class fast path. This is
not very useful right now, but it provides a significant performance
boost with an alternate ralloc entry point taking the old size.
use sized deallocation internally for ralloc
Fix variable declaration with no type in the configure script.
This cleans up the fast path a bit more by moving away more code.
* use sized deallocation in iralloct_realign
* iralloc and ixalloc always need the old size, so pass it in from the
  caller where it's often already calculated
It has no use for the arena_t since unlike rallocx it never makes a new
memory allocation. It's just an unused parameter in ixalloc_helper.
Unlike the preceeding attempted fix, this version avoids the potential
for converting an invalid bin index to a size class.
It is possible for the thread's tdata to be NULL late during thread
destruction, so take care not to dereference a NULL pointer in such
cases.
Fix quarantine to actually update tsd when expanding, and to avoid
double initialization (leaking the first quarantine) due to recursive
initialization.

This resolves jemalloc#161.
Reported by Denis Denisov.
Reported by Guilherme Gonçalves.

This resolves jemalloc#166.
This provides in-place expansion of huge allocations when the end of the
allocation is at the end of the sbrk heap. There's already the ability
to extend in-place via recycled chunks but this handles the initial
growth of the heap via repeated vector / string reallocations.

A possible future extension could allow realloc to go from the following:

    | huge allocation | recycled chunks |
                                        ^ dss_end

To a larger allocation built from recycled *and* new chunks:

    |                      huge allocation                      |
                                                                ^ dss_end

Doing that would involve teaching the chunk recycling code to request
new chunks to satisfy the request. The chunk_dss code wouldn't require
any further changes.

    #include <stdlib.h>

    int main(void) {
        size_t chunk = 4 * 1024 * 1024;
        void *ptr = NULL;
        for (size_t size = chunk; size < chunk * 128; size *= 2) {
            ptr = realloc(ptr, size);
            if (!ptr) return 1;
        }
    }

dss:secondary: 0.083s
dss:primary: 0.083s

After:

dss:secondary: 0.083s
dss:primary: 0.003s

The dss heap grows in the upwards direction, so the oldest chunks are at
the low addresses and they are used first. Linux prefers to grow the
mmap heap downwards, so the trick will not work in the *current* mmap
chunk allocator as a huge allocation will only be at the top of the heap
in a contrived case.
Fix OOM cleanup in huge_palloc() to call idalloct() rather than
base_node_dalloc().  This bug is a result of incomplete refactoring, and
has no impact other than leaking memory during OOM.
This eliminates the malloc tunables as tools for an attacker.

Closes jemalloc#173
In addition to true/false, opt.junk can now be either "alloc" or "free",
giving applications the possibility of junking memory only on allocation
or deallocation.

This resolves jemalloc#172.
Currently pprof will print output for all threads if a single thread is not
specified, but this doesn't play well with many output formats (e.g., any of
the dot-based formats).  Instead, default to printing just the overall profile
when no specific thread is requested.

This resolves jemalloc#157.
Dmitry-Me and others added 25 commits September 15, 2015 11:19
Systems that do not support chunk split/merge cannot shrink/grow huge
allocations in place.
Fix ixallocx_prof_sample() to never modify nor create sampled small
allocations.  xallocx() is in general incapable of moving small
allocations, so this fix removes buggy code without loss of generality.
Fix irallocx_prof_sample() to always allocate large regions, even when
alignment is non-zero.
Simplify imallocx_prof_sample() to always operate on usize rather than
sometimes using size.  This avoids redundant usize computations and
more closely fits the style adopted by i[rx]allocx_prof_sample() to fix
sampling bugs.
Fix prof_alloc_rollback() to read tdata from thread-specific data rather
than dereferencing a potentially invalid tctx.
Run integration tests with MALLOC_CONF="prof:true,prof_active:false" in
addition to MALLOC_CONF="prof:true".
Fix prof_tctx_dump_iter() to filter out nodes that were created after
heap profile dumping started.  Prior to this fix, spurious entries with
arbitrary object/byte counts could appear in heap profiles, which
resulted in jeprof inaccuracies or failures.
Zero all trailing bytes of large allocations when
--enable-cache-oblivious configure option is enabled.  This regression
was introduced by 8a03cf0 (Implement
cache index randomization for large allocations.).

Zero trailing bytes of huge allocations when resizing from/to a size
class that is not a multiple of the chunk size.
Make mallocx() OOM testing work correctly even on systems that can
allocate the majority of virtual address space in a single contiguous
region.
Work around a potentially bad thread-specific data initialization
interaction with NPTL (glibc's pthreads implementation).

This resolves jemalloc#283.
In addition to depending on map coalescing, the test depended on
munmap() being disabled so that chunk recycling would always succeed.
@rust-highfive
Copy link

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way Github handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive
Copy link

warning Warning warning

  • Pull requests are usually filed against the master branch for this repo, but this one is against dev. Please double check that you specified the right target!

@alexcrichton
Copy link
Member

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.