0.17 - 0.19 segfaults on powerpc64le some of the time, can't reproduce in gdb #4197
Comments
Got a stacktrace from a core dump:
|
Mind taking a look at the faulting instruction as well? That may illuminate what's going on here. |
I can reproduce it in gdb if I enable ASLR. I observed many different segfaults:
etc etc. The first three are given in more detail below:
|
I had not seen this before today. Until now I had run cargo successfully many times on the same machine, and it also worked on the build daemons for Debian-experimental. As of today, the segfaults appear on those same machines. I then tried Rust's own binary tarball, as well as older chroots (Debian-stable, Debian-testing) that in theory should be unchanged over the past few days, but the segfaults now occur there too. So I'm thoroughly confused. |
It may be worth trying to narrow this down; I highly doubt the bug is actually in Cargo itself. The faulting instruction from that last dump is
which IBM claims is a storage instruction. In that case we can substitute that as:
The memory address doesn't look misaligned or anything, so it looks fine to me. Perhaps the address is simply not mapped (this is a plain segfault). Either way, this smells like an LLVM or allocator bug? |
Looks like it's a stack overflow, |
Can you try to find out which function is using so much stack? (print the stack pointer for each frame using |
It does not look like the stack is going above 8MB though:
The machine was recently updated with a kernel security fix for DSA 3886 and some other Debian people think the fix for CVE-2017-1000364 might be the culprit here. Disabling ASLR with |
Not sure if I'm reading this correctly, but it looks like the stack is getting set to be zero-sized? From another run:
From
|
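For anyone wanting to check the stack extent directly, here is a minimal sketch. It assumes the region is read from a Linux `/proc/<pid>/maps` listing, where each line starts with a `start-end` hexadecimal address range; the thread does not say where the numbers above came from. `mapping_size` and `stack_size` are hypothetical helper names, not part of Cargo or Rust.

```rust
/// Size in bytes of the mapping described by one `/proc/<pid>/maps` line,
/// or `None` if the line does not begin with a `start-end` hex range.
fn mapping_size(line: &str) -> Option<u64> {
    // The first whitespace-separated field is the address range.
    let range = line.split_whitespace().next()?;
    let (start, end) = range.split_once('-')?;
    let start = u64::from_str_radix(start, 16).ok()?;
    let end = u64::from_str_radix(end, 16).ok()?;
    // `checked_sub` rejects a malformed range whose end precedes its start.
    end.checked_sub(start)
}

/// Size of the mapping labelled `[stack]` in a full maps listing, if any.
fn stack_size(maps: &str) -> Option<u64> {
    maps.lines()
        .find(|l| l.trim_end().ends_with("[stack]"))
        .and_then(mapping_size)
}
```

Calling `stack_size` on the contents of `/proc/self/maps` from `main` would report the current process's own stack mapping. A result of `Some(0)` would correspond to the zero-sized stack suspected above.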
How does |
Fresh off oss-security: http://www.openwall.com/lists/oss-security/2017/06/22/12 - possibly relevant |
Thanks. I can reproduce the segfault with the test program on this segfaulting machine, so I will file a bug report with Debian. FWIW, I stepped through the entire program and the segfault does in fact occur in the memory allocator, but for some reason this wasn't properly captured in the stack traces above:
Hopefully this is indeed the correct fix, but it'll take a while to confirm - I'll get back to this ticket when the relevant people apply the relevant patches. |
However, their test program still segfaults when I disable ASLR or increase the stack size, whereas cargo's segfault goes away. Do you think this is still compatible with the idea that this Linux issue is what's causing the cargo segfaults? |
Unfortunately the issue still persists with a (supposedly) patched kernel. The C code in that oss-security link no longer segfaults, and I can no longer avoid the segfaults by disabling ASLR, but using a larger stack still works. (edit: corrected 2017-06-30) |
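The comment above does not say how the larger stack was configured. As a rough in-process illustration only, assuming a stack-hungry workload can be moved off the main thread, the standard library's `std::thread::Builder::stack_size` lets a Rust program request a bigger stack for a worker thread. `recurse` and `run_with_stack` are hypothetical names, and this is not what Cargo itself does.

```rust
use std::thread;

/// Hypothetical stack-hungry workload: recurses `depth` times.
fn recurse(depth: u32) -> u64 {
    // A per-frame buffer intended to make each level consume stack
    // (an optimizing build may reduce the actual usage).
    let buf = [1u8; 1024];
    let here = std::hint::black_box(&buf)[0] as u64;
    if depth == 0 {
        here
    } else {
        here + recurse(depth - 1)
    }
}

/// Runs `f` on a new thread whose stack is requested to be at least
/// `stack_bytes` large, and returns its result.
fn run_with_stack<T, F>(stack_bytes: usize, f: F) -> std::io::Result<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let handle = thread::Builder::new().stack_size(stack_bytes).spawn(f)?;
    // A panic in the worker is propagated as a panic here.
    Ok(handle.join().expect("worker thread panicked"))
}
```

For example, `run_with_stack(64 * 1024 * 1024, || recurse(100))` runs the workload with a 64 MiB stack request. The main thread's stack is not affected by this; its limit comes from the environment that launches the process.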
@arielb1 do you think this is a Rust or a Cargo issue? It looks like the Rust issue has more activity, so I'm wondering if this one can be safely closed? (Thanks for filing, @infinity0! 👽 👾 🎇 ) |
I doubt there's anything particular to Cargo here. |
This is a duplicate of rust-lang/rust#43052 |
Yes, this should be fixed in the next release. Closing. |
With current stable:
Similar sort of thing with Debian 0.17, reported here