Peers disconnecting with UNKNOWN_EXCEPTION #4300

Closed
eigentsmis opened this issue Jun 14, 2020 · 13 comments

@eigentsmis

eigentsmis commented Jun 14, 2020

TESTING UPDATE: PLEASE SEE MY LAST COMMENT IN THIS THREAD

About 15-20min after opening the Bisq app on OSX 10.12.6, I see very long roundtrip times (see attached snippet), and most of the peers have disconnected with UNKNOWN_EXCEPTION (see attached log).

I did a lot of testing, and my conclusion so far is that this is directly related to the "Turn display off after" setting in the Energy Saver window on my Mac. It seems that once the display goes off, the system also enters some sort of idle/sleep mode, and I believe this impacts Bisq's network connectivity.

Two workarounds I have tested successfully so far:

  1. Set the "Turn display off after" setting to "Never" OR
  2. Add the "-d" option to the existing caffeinate command which Bisq issues upon starting.
    (Example: /usr/bin/caffeinate -w 4137 -d)
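For illustration, the second workaround could be sketched in Java roughly as follows. This is not Bisq's actual code: the class name `CaffeinateCommand` and the `build` helper are hypothetical, and only the `-w` and `-d` flags from the example command above are used. The sketch assumes Java 9+ for `ProcessHandle` and only launches the command on macOS.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not taken from the Bisq code base.
final class CaffeinateCommand {

    /** Builds the argument list for /usr/bin/caffeinate. */
    static List<String> build(long pid, boolean keepDisplayAwake) {
        List<String> cmd = new ArrayList<>();
        cmd.add("/usr/bin/caffeinate");
        cmd.add("-w");                  // keep the assertion until this process exits
        cmd.add(Long.toString(pid));
        if (keepDisplayAwake) {
            cmd.add("-d");              // additionally keep the display from sleeping
        }
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        long pid = ProcessHandle.current().pid();   // requires Java 9+
        List<String> cmd = build(pid, true);
        System.out.println(String.join(" ", cmd));

        // Only try to launch caffeinate where it exists.
        if (System.getProperty("os.name").toLowerCase().contains("mac")) {
            new ProcessBuilder(cmd).inheritIO().start();
        }
    }
}
```

Note that `-d` keeps the display on, so this trades energy savings for connectivity; whether that is acceptable is a separate question.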

huge_roundtrip_times

UNKNOWN_EXCEPTION.log

@dmos62
Contributor

dmos62 commented Jun 14, 2020

You experience these connection issues if you start Bisq after the display had turned itself off, right? If the display turns itself off (for the first time) while a Bisq instance is running (without connection issues until this point), does the connection degrade immediately? Does it ever recover without rebooting your Mac?

@eigentsmis
Author

eigentsmis commented Jun 14, 2020

You experience these connection issues if you start Bisq after the display had turned itself off, right? If the display turns itself off (for the first time) while a Bisq instance is running (without connection issues until this point), does the connection degrade immediately? Does it ever recover without rebooting your Mac?

The steps are:

  1. Open Bisq (with the display on and set to go into standby after 5min of inactivity)
  2. Wait about 20-30min
  3. Log back in to the computer and see that Bisq connections are either very low in number or have very high roundtrip times. Also, the number of offers shown goes way down (most of the time to no offers)
  4. After waiting a while without letting the display go off again, I do see Bisq trying to recover: roundtrip times improve, but only some of the offers show up, not all of them.
  5. To see all offers again, a Bisq restart is required.

@eigentsmis
Author

BIG TESTING UPDATE:
The display going to sleep and causing this issue seems to be a byproduct of the following root issue:

  1. Open Bisq
  2. Minimize the window in the Mac Dock
  3. Wait 30-60min
  4. Reactivate window
  5. Notice the huge roundtrip times, that most offers are gone, and that the log shows a bunch of peer disconnects due to UNKNOWN_EXCEPTION.

On the other hand, if the Bisq app is opened fresh and left as the active window on the screen overnight, everything is fine, no errors in the log, all peers still connected ok and all offers visible.

From what I can tell this seems to be an issue with JavaFX.
Somehow, if the Bisq window is not active, it causes networking degradation in the app.
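One way to check whether the process itself is being slowed down while its window is inactive would be a small timer-drift probe. The sketch below is a diagnostic idea only, under the unconfirmed assumption that the OS delays the app's timers; the class `StallDetector`, the one-second interval, and the tolerance factor are all hypothetical choices and not part of Bisq.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical diagnostic: reports when a periodic tick arrives much later than scheduled.
final class StallDetector {
    private final long intervalNanos;
    private final double tolerance;
    private boolean started;
    private long lastTickNanos;

    StallDetector(long intervalMillis, double tolerance) {
        this.intervalNanos = TimeUnit.MILLISECONDS.toNanos(intervalMillis);
        this.tolerance = tolerance;
    }

    /**
     * Records a tick. Returns how many milliseconds late it was if the gap since
     * the previous tick exceeds tolerance * interval, otherwise 0.
     */
    long onTick(long nowNanos) {
        if (!started) {
            started = true;
            lastTickNanos = nowNanos;
            return 0;
        }
        long gap = nowNanos - lastTickNanos;
        lastTickNanos = nowNanos;
        if (gap > intervalNanos * tolerance) {
            return TimeUnit.NANOSECONDS.toMillis(gap - intervalNanos);
        }
        return 0;
    }

    public static void main(String[] args) throws Exception {
        // Run time in seconds; pass a larger value (e.g. 3600) to cover the reproduction window.
        long runSeconds = args.length > 0 ? Long.parseLong(args[0]) : 5;
        StallDetector detector = new StallDetector(1000, 3.0);
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleWithFixedDelay(() -> {
            long lateMillis = detector.onTick(System.nanoTime());
            if (lateMillis > 0) {
                System.out.println("Tick was " + lateMillis + " ms late");
            }
        }, 0, 1, TimeUnit.SECONDS);
        Thread.sleep(TimeUnit.SECONDS.toMillis(runSeconds));
        timer.shutdownNow();
    }
}
```

If late ticks line up with the periods in which the window was inactive, that would point at the process being throttled rather than at the network layer; if no ticks are late, the cause is more likely elsewhere.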

@dmos62
Contributor

dmos62 commented Jun 22, 2020

Not sure I can explain this from the parts I'm familiar with. @sqrrm, @ripcurlx I think this deserves to be on the critical bugs board.

@sqrrm
Member

sqrrm commented Jun 23, 2020

Sleep mode used to cause problems, which is why we had a sound file looping: it kept the OS from withholding resources from Bisq. Did we remove this sound-playing feature? If so, that might well be the reason for these dropped messages.

@eigentsmis
Author

I believe the sound file looping has been replaced with caffeinate, where supported.
I can confirm that caffeinate is running but this seems to be a different issue.

After more testing it seems the Bisq window doesn't even need to be minimized, it just needs to be NOT the active window in order to reproduce this issue within max 60min.

@sqrrm
Member

sqrrm commented Jun 24, 2020

Sounds a lot like the issue that the sound loop was handling, though. Maybe there is some other resource prioritization going on that caffeinate doesn't handle but the sound loop did.
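For readers unfamiliar with the sound loop trick, a minimal sketch of the idea using the standard `javax.sound.sampled` API might look like this. It is not the original Bisq implementation: the class `SilentLoop` is hypothetical, and generating silent PCM in memory (rather than playing a bundled sound file) is an assumption made to keep the example self-contained. Whether silent audio is enough to keep the OS from throttling the app is not established in this thread.

```java
import java.io.ByteArrayInputStream;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.Clip;

// Hypothetical sketch of looping a sound continuously; not Bisq's original code.
final class SilentLoop {
    // 8 kHz, 16-bit, mono, signed, little-endian PCM.
    static final AudioFormat FORMAT = new AudioFormat(8000f, 16, 1, true, false);

    /** Creates an in-memory stream of silence (all-zero samples) of the given length. */
    static AudioInputStream silence(int seconds) {
        long frames = (long) (FORMAT.getSampleRate() * seconds);
        byte[] pcm = new byte[(int) (frames * FORMAT.getFrameSize())];
        return new AudioInputStream(new ByteArrayInputStream(pcm), FORMAT, frames);
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            Clip clip = AudioSystem.getClip();
            clip.open(silence(1));
            clip.loop(Clip.LOOP_CONTINUOUSLY);   // repeat until the clip is stopped or closed
            Thread.sleep(3000);                  // a real application would keep this running
            clip.close();
        } catch (InterruptedException e) {
            throw e;
        } catch (Exception e) {
            // Headless machines or machines without an audio device end up here.
            System.err.println("No audio output available: " + e);
        }
    }
}
```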

@eigentsmis
Author

So it looks like when building Bisq locally and starting with ./bisq-desktop the issue doesn't happen anymore for me.
Could it be that starting Bisq from the terminal tells OSX that the process has a higher priority than when it is started normally, and that OSX therefore does not take resources away from it?

@chimp1984
Contributor

I just observed the same issue. I had the Bisq binary running on OSX, and after the display had been asleep for a while the Bisq app had very few offers and started to fill up the offer book again. So the caffeinate solution does not prevent it from getting throttled by the OS.

From the description (https://www.unix.com/man-page/osx/8/caffeinate/) I am not sure caffeinate is really doing what we want. We do NOT want Bisq to prevent system or display sleep; we just want Bisq not to be sent into hibernation. The sound loop trick did that. To me it seems caffeinate is the wrong tool, and maybe there is no right one on OSX.

@dmos62 Are you familiar with caffeinate? The dev (@christophsturm) who introduced it is not active in Bisq anymore.

I think this is a very important issue as OSX users who have offers online will either not get their offers published, or if one takes the offer there is a high chance of timeout or connection issues which can lead to failed trades.
I will make a PR to revert to the old sound file solution until we are sure that the alternative solution works. It seems nobody tested this change on OSX when it was deployed ;-(.

chimp1984 added a commit to chimp1984/bisq that referenced this issue Aug 9, 2020
It seems caffeinate is not preventing that Bisq gets throttled resources
once the OS switches to hibernate.
See:
bisq-network#4300 (comment)
@chimp1984
Contributor

As @eigentsmis pointed out, the behaviour of the binary and of the Java app started from source code may differ. Both versions need to work reliably.

@dmos62
Contributor

dmos62 commented Aug 9, 2020 via email

@stale

stale bot commented Nov 7, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the was:dropped label Nov 7, 2020
@stale

stale bot commented Nov 15, 2020

This issue has been automatically closed because of inactivity. Feel free to reopen it if you think it is still relevant.

@stale stale bot closed this as completed Nov 15, 2020