Abort on some large allocation requests, panic on others #26951
For the record, your code is the equivalent of `fn main() { let v = ::std::vec::from_elem(0u16, !0); }`. In the […]
Additional info: already […]
/cc @rust-lang/compiler, I think?
This is a @rust-lang/libs problem. I am of the opinion that aborts should only occur on problems that unwinding can't address. That is, if you overflow the capacity we have to check for that, and we can panic right away. Everything will be cleaned up and the world will be happy.

However, if an allocation makes it past our checks and then fails in the allocator, we have historically conservatively assumed that you are legitimately OOM and abort; unwinding can lead to arbitrary other allocations, which is presumably bad to do during OOM. In practice, the platforms we support basically can't OOM naturally (by which I mean, you're asking for the last few bits of free memory), so I reckon if you ever managed to trigger an OOM it's because you managed to request All The Memory, in which case there's plenty of slack space for unwinding. If it's a legit OOM then we'll double panic and crash anyway.

However, #14674 mentions that aborting on OOM enables LLVM to better optimize around allocations. I don't know the details.

Edit: Regardless, I do not want to add any kind of "heuristic checks" for "will probably OOM" or "was probably a real OOM".
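A simplified sketch of the two failure paths described above (illustrative only, written with today's `std::alloc` names rather than the liballoc internals of the time): the up-front capacity check can panic cleanly because nothing has been allocated yet, while a null return from the allocator is conservatively treated as a real OOM and aborts.

```rust
use std::alloc::{alloc, Layout};
use std::process;

fn allocate_array<T>(len: usize) -> *mut u8 {
    // Path 1: up-front size check. Nothing has been allocated yet, so a
    // failure here can simply panic ("capacity overflow") and unwind cleanly.
    let layout = Layout::array::<T>(len).expect("capacity overflow");
    assert!(layout.size() > 0, "sketch only covers non-empty allocations");

    // Path 2: the request passed our checks, so we hand it to the allocator.
    // If it reports failure we may be genuinely out of memory, so the
    // standard library has conservatively aborted rather than risk
    // allocating while unwinding.
    let ptr = unsafe { alloc(layout) };
    if ptr.is_null() {
        process::abort(); // roughly what alloc::oom() did at the time
    }
    ptr
}
```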
I think that, ultimately, abort vs. unwind behaviour should be left up to the implementation, but the standard library should abort on failed allocations. This particular example is just a red herring. It's only because […]
Also to be completely clear to those not super familiar with the issue: the "odd" behaviour described in this issue is completely by design, and not a mistake. Rather this issue is positing that the behaviour should be changed to be more consistent. |
Of course. I don't think it's completely by design; it's patched up to be this way. For example, if we know a capacity of […]
@bluss any time we detect this, I believe we panic. Only […] We simply don't even check in some cases where we know any degeneracies will be caught by the allocator (e.g. when doubling capacity on 64-bit).
Note that checking for […]
That’s what they said about 32-bit timestamps and addressing too.
@nagisa Today this is simply a hard hardware issue: you only get 48 bits of the address space. |
Yes, but it is always useful to be aware of future prospects. |
@nagisa I have thankfully architected the current design to make such an upgrade trivial. In fact, all you have to do is delete some code! https://github.com/rust-lang/rust/blob/master/src/liballoc/raw_vec.rs#L445-L453
So is this not a bug? |
It's not an accident, at least.
I think all conversations I've had about this have said that performance improvements from aborting instead of panicking are unlikely; it saves code size in the binary with likely very little to no runtime impact. We don't have numbers. So I reported this because I'd like to use […]
I think the @rust-lang/libs team just needs to make a final call on this. I think @bluss and I have both made our stances fairly clear, but to summarize (hopefully I'm not misrepresenting bluss):

- Always Abort: […]
- Sometimes Abort: […]
- Never Abort (wild card!): […]
Code size may have an impact (from unrelated Servo discussion) 1: […]
triage: P-medium. As an action item, the libs team has decided to see if it's possible to panic on OOM, and we can possibly reevaluate from there.
Would be interesting to check if a singleton pointer value is enough to unwind without allocating. In the case of true OOM, it's possible that secondary allocations during unwinding (which should be rare, I can't recall ever seeing an allocating destructor) could succeed due to deallocations earlier in the unwinding process.
@eddyb I think our main concern is libunwind allocating. A quick look seems to reveal they never check their mallocs. However maybe there's some macro shenanigans or maybe this is all utility code that isn't called by us. I didn't really dig deep into it (also maybe I checked out the wrong version...). |
In particular if Rust allocates during OOM that should be fine -- we always check and a double panic is just an abort. |
Java pre-allocates an OutOfMemory exception at initialization to ensure it can properly throw when out of memory - it might make sense to do the same thing here if possible. |
IIRC Python does something similar as well.
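A hedged sketch of what that Java-style pattern could look like in Rust (the names here are illustrative, not an existing std API): box the payload once at startup, then hand the already-boxed value to `std::panic::resume_unwind` when the OOM panic is raised, so that raising the panic does not itself need a fresh allocation (the unwinder may still allocate internally, as noted above).

```rust
use std::any::Any;
use std::panic;
use std::sync::Mutex;

#[derive(Debug)]
struct OutOfMemory;

// Payload boxed once at startup, while memory is still plentiful.
static PREALLOCATED_OOM: Mutex<Option<Box<dyn Any + Send>>> = Mutex::new(None);

fn init_oom_payload() {
    *PREALLOCATED_OOM.lock().unwrap() =
        Some(Box::new(OutOfMemory) as Box<dyn Any + Send>);
}

fn raise_oom() -> ! {
    // No fresh allocation for the payload here; it was boxed in advance.
    let payload = PREALLOCATED_OOM
        .lock()
        .unwrap()
        .take()
        .expect("OOM payload missing or already consumed");
    panic::resume_unwind(payload)
}
```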
Preallocating might be difficult given that Rust code might not be called from inside a Rust main. I guess we could use pre-allocation for when it is called from a Rust program, and then fall back to something else for the case where it is called from foreign code.
cc me |
@Aatch the problem of preallocation seems pointless, since we control what the pointer value means, and if it points to some […]
Note that this is a memory safety issue for a few types: Box::new can't panic today -- this would introduce more exception-safety nonsense. |
I don't really have an opinion on how the problem is solved; I would just recommend fixing the illegal instruction error soonest. I had bad code that was incorrectly calculating a vector length, which only kicked in when I tried to run the release build without arguments. Having the program crash repeatedly with "Illegal Instruction" had me incorrectly assuming the Rust compiler was fubar and generating bad code. At least I'm assuming it's the same issue, with Rust 1.2 on a 64-bit machine, calling something like:
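The snippet itself did not survive the thread formatting; as a hedged guess, the shape of the bug described is something like a length computed from the argument count that wraps to a huge `usize` in release mode:

```rust
use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();
    // With no arguments, args.len() is 1, so this wraps to a huge value in a
    // release build (a debug build would panic on the subtraction instead).
    let len = args.len() - 2;
    // Requesting ~2^64 bytes: the allocator fails and the program aborts
    // with "Illegal Instruction", matching the report above.
    let v = vec![0u8; len];
    println!("{}", v.len());
}
```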
Please don't panic on OOM, especially now that catch_panic is safe. |
@pixel27 I imagine a good fix would be a better way to report the issue. Maybe an output message before the abort (which I believe is an issue we have filed). |
@pythonesque Why? If it's not possible to deal with OOM in Rust, I can't use Rust. See https://internals.rust-lang.org/t/could-we-support-unwinding-from-oom-at-least-for-collections/3673/21 for an exploration of why handling OOM is important. |
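For context on what unwinding-based recovery would look like: `std::panic::catch_unwind` (the stabilized form of the `catch_panic` mentioned above) can already contain the "capacity overflow" panic from the `u16` case in this issue; it is only the allocator-failure abort (the `u8` case) that it cannot catch. A minimal sketch:

```rust
use std::panic;

fn main() {
    let result = panic::catch_unwind(|| {
        // The byte-count computation overflows, so this panics with
        // "capacity overflow" rather than reaching the allocator.
        let v = vec![0u16; usize::MAX];
        v.len()
    });
    match result {
        Ok(len) => println!("allocated {} elements", len),
        Err(_) => println!("recovered from the allocation panic via unwinding"),
    }
}
```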
@nathanaeljones IMO the real problem here is that we don't have an allocator abstraction in […]
@nathanaeljones the default std allocator aborts, but given Rust's low-level nature, you could replace it with one that doesn't. It is a lot more work right now, but as @eddyb said, this is something that will come along in time.
@steveklabnik Thanks; I'm not too concerned with 'lots of work', but rather with whether I can produce well-behaved binaries for both FFI and server use. I started porting Imageflow over to Rust in June, but failed to verify that OOM panics (as widely reported) instead of aborting. I'd assumed that the stabilization of […]

It's very unclear to me where the anti-OOM-panic sentiment arises from. Double panics abort. I've been trying to track down blockers around this, but haven't been able to discover the reasons why movement on graceful OOM handling has been slow. I think I might have found the right issue to discuss: #27700 (comment)
Triage: still blocked on a libs team decision. |
Looks like there is now an unstable […]. Mentioned in that issue, there is also an accepted RFC to set a cargo attribute for […]. And another issue to make the default for […].
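For completeness, fallible allocation did eventually land for collections in the form of `try_reserve` (stable since Rust 1.57), which reports failure as a `Result` instead of panicking or aborting; a small sketch of the graceful handling asked for earlier in the thread:

```rust
use std::collections::TryReserveError;

fn read_exactly(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    buf.try_reserve_exact(len)?; // fails cleanly if `len` bytes can't be had
    buf.resize(len, 0);
    Ok(buf)
}

fn main() {
    match read_exactly(usize::MAX) {
        Ok(buf) => println!("got {} bytes", buf.len()),
        Err(e) => eprintln!("allocation failed: {e}"),
    }
}
```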
We discussed this in the Libs API meeting today: the consensus was that we should pipe "invalid layout" errors through to […]
Compare these two: we request a vector with 18446744073709551615 elements.

- With `u8`, we receive out of memory (null) from the allocator and call `alloc::oom`, which aborts the program: `application terminated abnormally with signal 4 (Illegal instruction)`.
- With `u16`, we get an assertion: `thread '<main>' panicked at 'capacity overflow', ../src/libcore/option.rs:330`.
This is inconsistent. Why don't we abort in both of these cases? We already abort on allocation requests that are too large, so requests that are even larger could abort too.
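A minimal reproduction of the two cases, based on the note earlier in the thread that this request is equivalent to `std::vec::from_elem(elem, !0)` (run one case at a time, since the first aborts the whole process):

```rust
fn main() {
    // u8: the byte count fits in usize, the allocator returns null, and
    // (at the time) alloc::oom() aborted: signal 4 / "Illegal instruction".
    let _a = vec![0u8; !0];

    // u16: the byte count (len * 2) overflows usize, so the up-front
    // capacity check panics with "capacity overflow" instead.
    // let _b = vec![0u16; !0];
}
```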