Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ofThread : short lived threads can cause uncaught exceptions #5262

Open
armadillu opened this issue Sep 13, 2016 · 8 comments
Open

ofThread : short lived threads can cause uncaught exceptions #5262

armadillu opened this issue Sep 13, 2016 · 8 comments

Comments

@armadillu
Copy link
Contributor

armadillu commented Sep 13, 2016

I am getting sporadic crashes on very short lived threads; This code is enough to reproduce the issue; at about one crash per minute. I am seeing this happen on OS X 10.11.

class MyThread : public ofThread{
public:
    void threadedFunction(){}
};

void ofApp::update(){
    MyThread* t = new MyThread();
    t->startThread();
}

It seems that every now and then the thread fails to detach() and throws an exception.

Wrapping the call in a try/catch keeps the app running.

try{
    thread.detach();
}catch(...){
    ofLogFatalError("ofThreadErrorLogger::exception") << "failed to detach!";
}
@arturoc
Copy link
Member

arturoc commented Sep 13, 2016

can you check which kind of exception is throwing? a debugger should tell you that

@armadillu
Copy link
Contributor Author

thread::detach failed: Invalid argument

@armadillu
Copy link
Contributor Author

i tried checking if its joinable() b4 detaching first, but it happens too.

@arturoc
Copy link
Member

arturoc commented Sep 13, 2016

yes it's better to check the exception since the thread could be joinable when checking but become not joinable after. can you send a PR?

arturoc added a commit that referenced this issue Sep 20, 2016
Fix for #5262 - Short lived threads can cause exceptions
@armadillu
Copy link
Contributor Author

fixed in #5263

@armadillu
Copy link
Contributor Author

armadillu commented Aug 8, 2017

I am going to reopen this; I finally found some time to really look into it and it still is an issue.

When you launch several short-lived ofThreads too suddenly (i.e. spawn a few threads in a single frame), sometimes the thread gets in a weird state.

I have an example demonstrating the issue here. I have seen the issue both on OSX and windows.

If you run the code above, you will see printouts like this:

[ error ] ofApp: ofThread is done but we will crash if we try to delete - LEAKING

every few seconds. This is as far as I got with this, basically the thread object is done (and it executed fine) but the std::thread itself is in a weird state, so if you try to delete it it leads to crash.
You can modify the USE_OF_THREAD macro and set it to false to see how it doesn't happen with an alternative implementation.

I think it is related to the way the ofThread is launched with :

thread = std::thread(std::bind(&ofThread::run,this));

interacting with the detach()call in weird ways - as you can see here but I am unsure of what the difference really is.

This is a pretty big stability problem for one of my addons; I worked around the problem by creating a simplified ofThread-like class that works 100% ( left it running for weeks ). You can see this alternate ofThread class here.

@arturoc
Copy link
Member

arturoc commented Aug 9, 2017

if you change thread = std::thread(std::bind(&ofThread::run,this)); to the way you do it then it doesn't crash?

also opening a lot of threads in a short period of time doesn't sound like a good idea, since it's a really expensive operation, not saying we shouldn't fix this because of that, this is clearly a bug, but if you need to use several threads in an application you probably want to create a pool of threads and send tasks to then instead or simply use std::async which should use a pool of threads behind the scenes

@armadillu
Copy link
Contributor Author

armadillu commented Aug 9, 2017

No, changing the example to thread = std::thread(std::bind(&ofThread::run,this)); doesn't make it crash.

Playing a bit more with the example, the cause seems to be the detach() from within the thread.

The original reason for this issue caused the app to crash bc detach() sometimes fails on very short threads. Catching the exception around detach() just hides away the issue temporarily, as when the detach() fails, the thread remains "bugged" and can't be deleted (I haven't found a way to do so without a crash). You can at least detect when this happened by checking the if the thread ID is valid (see here).

The issue seems partly caused because detach() is called from within the thread itself; if I call detach right away from the main thread after creating it, the problem disappears that's the main difference in my alternative ofThread implementation here).

I've been searching for docs on deatch() to confirm if its allowed to use it from within the thread, or some advice on where should it be called from, but I found nothing about it – although all the examples I come across seem to use detach() from the main thread.

It seems everything is easier if you to know what you will do with the thread before you spawn it (and I mean "what will you do" in terms of cleanup strategy; will you join it or will you detach it?). We added detach() in the first place (#2506) bc if the user doesn't explicitly waitForThread() on the ofThread object from the main thread, ofThread ends up leaking resources and those add up leading to an inability to spawn threads. Detaching is currently almost the very last thing the ofThread does. We can not really delay it any further, and if your threadedFunction() is very short lived, the issue will to happen from time to time if we detach() blindly as we currently do.

My current solution only works for me because I know beforehand that I will not be joining the thread I'm spawning; thus I can afford to detach it safely from the main thread as soon as I spawn it.

As it seems detaching from the main thread rises no issue, I wonder if an ofThread API change enforcing the user to spawn the thread explicitly stating a cleanup strategy would be to ask too much for the user - because doing so would solve the problem.

I could see something along the lines of ofThread::startThread(bool willJoinLater); if we know what the user expects to do from the thread (i.e. no plans to join it) then we can detach it right away from the main thread from startThread() (as the user guarantees he/she will not join the thread).

In the other case, if the user starts a thread telling ofThread that he/she plans to join the thread, we can issue detect a broken promise and issue warning if the threadedFunction() exits and the waitForThread() method hasn't been called (i.e. the user forgot to join the thread), reminding the user about the resource leak. Either way I think we should not automatically detach() from the thread itself as it can fail under partly unclear circumstances (how short-lived is too short for a thread to fail detach()?).

This may be a lot to ask for the users - but I believe this will also make the users more aware about the internals of threads and general app architecture, which I would argue is beneficial. I think adding an example demonstrating the two cases would also make things much more clear.

We should probably leave the current API as it is with unchanged behaviour for backwards compatibility, and add two alternative spawn methods i.e. ofThread::startDetachThread() and ofThread::startJoinLaterThread() for more advanced users who have a clear idea on what their thread will do.

PD: Yes the overhead on this example is ludicrous but I merely built it to be able to reproduce the issue. I have a centralised queue that limits on how many requests (i.e. threads) can happen per frame, but because I also have other threads spawning across other areas of the app (and the modules aren't really aware of each other), the issue still arises from time to time when a few spawn too "close" to each other.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants