A CI Test vector #265
bot:retest |
15 similar comments
I've hit a few snags with the CI: some of them are infrastructure problems, others are bugs in PRRTE. I made a note of the issues in the link below. I'll return to this after break. |
bot:retest |
ec5e5ec to 0375228
bot:ibm:retest |
Notes on PRRTE issues encountered so far:
I'm taking a look into these, but currently, this is what is blocking CI testing. The build looks ok. @rhc54 FYI these are some of the PRRTE issues I've encountered so far. They should be reproducible in the virtual cluster environment if you wanted to try to reproduce. |
bot:ibm:retest |
1 similar comment
bot:ibm:retest |
bot:ibm:retest |
Hmm, there still seems to be a race in the hello world test program. If I run it across 7 nodes, it passes most of the time, but sometimes it hangs after displaying the output. Last output to the console from the
So it looks like the |
It took some digging but I think I figured out the problem. Symptom:
Reproducer:
Analysis:
Root cause:
Solutions:
My inclination is to go with (1) since that gives us a maximum of ~71 minutes that we hold on to events, which should be ok for the near term. Additionally, I could change the type from |
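For readers wondering where a ceiling of ~71 minutes could come from: the figure is consistent with a timeout stored as an unsigned 32-bit count of microseconds. That is an assumption made here for illustration; the comment above does not state the type or the unit. A minimal sketch of the arithmetic, with hypothetical names:

```python
# Assumption (not stated in the thread): the timeout is an unsigned
# 32-bit integer holding microseconds. All names here are hypothetical.
UINT32_MAX = 2**32 - 1          # largest value an unsigned 32-bit integer can hold
MICROSECONDS_PER_SECOND = 1_000_000


def max_timeout_minutes(max_ticks: int = UINT32_MAX,
                        ticks_per_second: int = MICROSECONDS_PER_SECOND) -> float:
    """Convert the largest representable tick count into minutes."""
    seconds = max_ticks / ticks_per_second
    return seconds / 60


print(f"{max_timeout_minutes():.2f} minutes")  # prints "71.58 minutes"
```

Under that assumption, a wider integer type or a coarser unit would raise the ceiling accordingly.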
While working on the bug described above, I ran into an odd race condition on the Stack trace of the
It is difficult to reproduce, but I might look into it a bit more this week. |
Impressive work chasing this down! I'm actually seeing a problem that is likely related to the eviction time issue, so this will help a lot. See my comment on your proposed fix. |
I tracked down the tool hang noted above. The fix is in #319 |
bot:ibm:retest |
34c0cf1 to 8f6fb99
I'm working on the IOF issue. A few things I've learned/fixed so far
if it is sending stdin up to the In summary - I'm still digging, but have some footprints to follow :) |
I figured out the
You can see that thread A is processing the I'm tinkering with some solution options at the moment. |
bot:ibm:retest |
2 similar comments
bot:ibm:retest |
bot:ibm:retest |
bot:ibm:scale:retest |
1 similar comment
bot:ibm:scale:retest |
bot:ibm:retest |
bot:ibm:pgi:retest |
bot:ibm:scale:retest |
bot:ibm:pgi:retest |
PGI failure is known. I'm going to disable that build for now. Otherwise, we should be good here. |
Feel free to turn it "on" whenever you judge it ready - much thanks! |
IBM CI is now globally active on PRRTE (GNU and XL compilers). No special label required. Closing this PR since we don't need it anymore. |
No description provided.