-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MAINT: Make tests less flaky #168
Conversation
Codecov Report
@@ Coverage Diff @@
## main #168 +/- ##
==========================================
+ Coverage 72.27% 73.32% +1.04%
==========================================
Files 29 29
Lines 3012 3033 +21
Branches 671 673 +2
==========================================
+ Hits 2177 2224 +47
+ Misses 736 714 -22
+ Partials 99 95 -4
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Okay now the tests fail with at least more verbose information at the MNE-lsl and liblsl levels now, e.g., this one where there is a segfault:
@mscheltienne do you want to look now that there is more information available? There is some logging stuff in there that is suspicious to my untrained eye like the "Stream transmission broke off". Locally running with stuff like For now I added FYI each test now uses its own |
This part we can confirm because we hack in the increasing sample numbers as an EEG channel, and find that for example 70 samples in we get a 17-sample jump when we should get only 1-sample jumps.
This could be the culprit. Looking at https://github.com/mne-tools/mne-lsl/actions/runs/6865120608/job/18668396515?pr=168#step:9:1217 for example we see:
Any ideas on what could cause this part? |
Let's see if we can summon @tstenner in here who I know has worked on the "Stream transmission broke off" problem in the past. I've encountered it too but I never looked into the origin. |
Just pulling what I can from memory... If you see this error repeated over-and-over then this is probably a protocol mismatch between the outlet and the inlet. There were a couple funky versions of liblsl that caused this. But you're only seeing this message once in a while so I don't think that's it. Here is where that particular error is being generated. Unfortunately we have 2 functions that generate the same error string. Again, this seems like a protocol problem. I would only expect that in the middle of normal transmission if there was an actual network problem, like the interface went down for a moment and there was an incomplete transmission. |
We are using and limiting
Anyway, closing this for today 🙈 Back at it tomorrow ;) |
@larsoner , after 4487567 the run only had a couple segfaults. For a while, we had the same problem: segfaults in macos only, and only on GitHub CI -- we couldn't reproduce locally. This kind of went away on its own with some other changes. We still have segfaults, but it doesn't seem to be platform specific. They mostly happen during liblsl\testing\ext\bench_pushpull.cpp. This one is a little weird. In one of its parameterizations, it creates a single outlet and quickly creates up to 10 inlets for that outlet. Then it pushes some samples to the outlet and pulls them in each inlet. I hope this was helpful somehow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a while, we had the same problem: segfaults in macos only, and only on GitHub CI -- we couldn't reproduce locally.
That seems to be the case for us, too! Good to know we're not alone. I managed to stabilize things at least somewhat. Got one green run, then re-ran CIs and at least don't seem to get segfaults anymore. I think the big gain probably came from using a block size of 64 vs 16... CIs tend to be slower at everything so I was thinking that would give everything a bit more time to process by reducing call overhead. But who knows...
@mscheltienne I suggest we merge this as-is and investigate the bad_gh_macos
separately later. In the meantime this improves stuff a lot.
@@ -3,12 +3,14 @@ | |||
import time | |||
from ctypes import byref, c_char_p, c_double, c_int, c_size_t, c_void_p | |||
from functools import reduce | |||
from threading import Lock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... and I added threading.Lock
to help prevent segfaults.
@property | ||
def _obj(self): | ||
if self.__obj is None: | ||
raise RuntimeError("The StreamInlet has been destroyed.") | ||
return self.__obj | ||
|
||
@_obj.setter | ||
def _obj(self, obj): | ||
self.__obj = obj | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Plus some protection against errantly using self._obj
after it has been destroyed
mne_lsl/utils/_tests.py
Outdated
# On macOS we get differences like -0.030293 vs -0.03031, -0.030286 vs -0.030313 | ||
atol = 0.0001 if platform.system() == "Darwin" else 0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love this but it seems necessary 🤷
bad_gh_macos = pytest.mark.skipif( | ||
platform.system() == "Darwin" and os.getenv("GITHUB_ACTIONS", "") == "true", | ||
reason="Unreliable on macOS CI", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... and I had to add this skip for the drop_channels test. Looking at the last few failures, other than the segfaults most were on this test and only on macOS.
Thanks a lot @larsoner; first green run in a long long time 🙈 The same One more CI rerun; still one failure on windows. The surprising part is that this |
Happy to merge as is and to keep digging for the remaining points separately, that's already a huge improvements. |
This reverts commit 8ce0263.
So far I'm happy with the CI output at least. We should probably make the LSL level configurable using an env var. Then we can have a default config for devs (maybe less verbose?) and set the verbosity in CIs however we want. But I'll merge this as-is and feel free to tweak those in a follow-up PR! |
The three big causes are handshake failures (mainly weird decimal point issues for some europeans) before the first sample, the outlet getting closed before the input and the outlet disappearing before the inlet. The MacOS problems happened (as Chadwick pointed out) mostly in a synthetic benchmark, but only on the CI systems and never on one of my local machines. If anyone manages to get a stack trace or dump, I could investigate this further. |
Closes #166
mne._fiff
import (must check 1.6 before checking 1.5)Player
use in fixtureslib
calls