Test failure of EB 2.7.0 #1685

Closed
jhein32 opened this issue Mar 21, 2016 · 31 comments

@jhein32

jhein32 commented Mar 21, 2016

Hi Kenneth,

Following your message I upgraded our cluster to EB 2.7.0 and got a few failures. If I understand the errors correctly, this one is strange: when I run eb --list-toolchains and eb --avail-module-naming-schemes on the command line, the commands behave normally.

Here is a screenshot from the test:

[Screenshot of the test output, taken 2016-03-21 at 18:35:08]

Thanks for looking.

Best wishes
Joachim

@boegel
Member

boegel commented Mar 21, 2016

@jhein32 There's a new requirement in the tests, i.e. that eb is available through $PATH (that wasn't needed before to make the tests pass).

However, since you're loading the EasyBuild module, you have eb available through $PATH in this session, right?

Do you have anything particular in your .bashrc that may be getting in the way? These tests use run_cmd to run eb, which is done in a subshell...
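
A minimal sketch of why $PATH matters for a command run in a subshell, assuming Python 3 and a POSIX shell: the command only resolves if its directory is in the PATH handed to that subshell, and a failed lookup is reported by the shell as "command not found" with exit code 127. The names `run_in_subshell` and `fake-eb` are hypothetical stand-ins for illustration; they are not EasyBuild's run_cmd or the real eb.

```python
import os
import subprocess
import tempfile


def run_in_subshell(cmd, env):
    """Run cmd through the shell and return (exit_code, combined_output)."""
    proc = subprocess.Popen(cmd, shell=True, env=env,
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return proc.returncode, out.decode(errors="replace")


# Create a throwaway executable named 'fake-eb' (hypothetical stand-in for eb).
tool_dir = tempfile.mkdtemp()
tool = os.path.join(tool_dir, "fake-eb")
with open(tool, "w") as handle:
    handle.write("#!/bin/sh\necho fake-eb ran\n")
os.chmod(tool, 0o755)

base_path = os.environ.get("PATH", "/usr/bin:/bin")
env_with = dict(os.environ, PATH=tool_dir + os.pathsep + base_path)
env_without = dict(os.environ, PATH=base_path)

# Same command, two environments: only the PATH differs.
ec_with, out_with = run_in_subshell("fake-eb", env_with)
ec_without, out_without = run_in_subshell("fake-eb", env_without)

print(ec_with, out_with.strip())
print(ec_without, out_without.strip())
```

The subshell inherits whatever environment the calling Python process has at that moment, so anything that alters $PATH inside the test process before the command runs changes the outcome, even if eb works at the interactive prompt.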

@boegel boegel added this to the v2.8.0 milestone Mar 21, 2016
@jhein32
Author

jhein32 commented Mar 21, 2016

On 21 Mar 2016, at 23:08, Kenneth Hoste <notifications@github.com> wrote:

@jhein32 There's a new requirement in the tests, i.e. that eb is available through $PATH (that wasn't needed before to make the tests pass).

However, since you're loading the EasyBuild module, you have eb available through $PATH in this session, right?

eb was available. I could issue the commands at the prompt and they succeeded.

Do you have anything particular in your .bashrc that may be getting in the way? These tests use run_cmd to run eb, which is done in a subshell...

I don’t think there is anything particular in my .bashrc or similar. Are these the only tests running the eb command via “run_cmd”?

Joachim



@boegel
Member

boegel commented Mar 25, 2016

@jhein32 Yes, only these two tests require having eb available in $PATH...

I think the problem is related to the use of module purge by the tests, which is fixed in #1702.

cc @pforai

@boegel
Member

boegel commented Mar 28, 2016

@jhein32 Please try applying the simple patch from https://github.com/hpcugent/easybuild-framework/pull/1702/files, which should resolve the problem with eb when running the tests.

@boegel boegel closed this as completed Mar 28, 2016
@jhein32
Author

jhein32 commented Apr 12, 2016

Hi,

I finally managed to test this (sorry that it took time) and, unless I made a mistake, this does not seem to resolve the issue :(

I hand-edited the file utilities.py in the directory
software/Core/EasyBuild/2.7.0/lib/python2.7/site-packages/easybuild_framework-2.7.0-py2.7.egg/test/framework
of our EasyBuild installation. I commented out lines 174 & 175 of that file by prefixing them with a #. I deleted the matching pyc and pyo files (the pyc has since been regenerated). The test gives exactly the same results as before.

Please let me know how you would like to proceed.

Best wishes
Joachim

@jhein32
Author

jhein32 commented Apr 12, 2016

Hi Kenneth,

I finally tried it, but it doesn’t appear to fix the issue. I added info to the ticket on GitHub in the hope that this would reopen the issue, but it seems it did not. Please let me know if you’d like to continue working on this.

Best wishes
Joachim

On 28 Mar 2016, at 19:56, Kenneth Hoste <notifications@github.com> wrote:

@jhein32 Please try applying the simple patch from https://github.com/hpcugent/easybuild-framework/pull/1702/files, which should resolve the problem with eb when running the tests.



@boegel
Member

boegel commented Apr 12, 2016

@jhein32 Hmm, that's weird... Can you make sure you edited the correct file, by running this:

python -c "import test.framework.utilities; print test.framework.utilities.__file__"

To retest, you can also just run the options subsuite of tests, using python -O -m test.framework.options

@boegel boegel reopened this Apr 12, 2016
@boegel
Member

boegel commented May 10, 2016

@jhein32 any updates on this?

@boegel boegel modified the milestones: v2.9.0, v2.8.0 May 10, 2016
@jhein32
Author

jhein32 commented May 11, 2016

@boegel

Hi Kenneth,

I replied by email on 12 April (did you get this?) that your command indicates it is using the updated file. I am not sure how to take this further. Should I just wait for EB 2.8.0 to be released and check whether the issue is still present?

Best wishes
J.

@boegel
Member

boegel commented May 11, 2016

@jhein32 seems like I didn't get that message, indeed... (it also doesn't appear here in the issue)

If eb is available in $PATH when you're running the tests, and you patched the right test/framework/utilities.py file, I'm not sure why the tests would still fail...

Can you copy-paste the exact series of commands you're executing? Maybe run python -m test.framework.options to run the partial test suite instead (should be the same result, and it'll be a lot faster).
And also run grep purge <prefix>/test/framework/utilities.py, where you replace <prefix> according to your setup, of course (it shouldn't show any hits on 'purge').

Please also see if you still have the problem when using python -O -m test.framework.options, but that shouldn't make a difference either.
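
The two checks above (which file Python actually imports, and whether that file still mentions purge) can be combined in one small helper. This is only a sketch, assuming Python 3; `module_source_contains` is a hypothetical name, and the standard-library `json` module is used as a stand-in so the snippet runs on any machine.

```python
import importlib.util


def module_source_contains(module_name, needle):
    """Return (source_path, found) for the file Python would import."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        raise ImportError("cannot locate %s" % module_name)
    with open(spec.origin, encoding="utf-8", errors="replace") as handle:
        return spec.origin, needle in handle.read()


# Stand-in module; on a real setup this would be 'test.framework.utilities'
# with 'purge' as the search string.
path, found = module_source_contains("json", "def loads")
print(path, found)
```

Checking the path that the interpreter resolves, rather than a path typed by hand, rules out the case where a second copy of the file shadows the one that was edited.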

@jhein32
Author

jhein32 commented May 20, 2016

Hi Kenneth,
Just a sign of life: I just upgraded our installation to EB 2.8.0 and the issue is still there. I will try to get the above tests done early next week.

Best wishes
Joachim

@boegel
Member

boegel commented May 20, 2016

@jhein32 OK, that's weird, because I tried to reproduce this on a system where EasyBuild is only available via a module file, and I couldn't get it to fail like it does for you.

So, I'd love to get more info on this, when you can find the time.

@jhein32
Author

jhein32 commented May 24, 2016

Hi Kenneth

I did a fresh login. The commands I executed and the output I got are as follows:

-bash-4.2$ module load lmod/6.0.24 
-bash-4.2$ export TEST_EASYBUILD_MODULES_TOOL=Lmod
-bash-4.2$ module load EasyBuild/2.8.0 
-bash-4.2$ python -m test.framework.options
INFO: This is (based on) vsc.install.shared_setup 0.10.2
...................Skipping test_from_pr, no GitHub token available?
.Skipping test_from_pr, no GitHub token available?
.........EE......Skipping test_new_pr, no GitHub token available?
.......Skipping test_review_pr, no GitHub token available?
...............
======================================================================
ERROR: test_include_module_naming_schemes (__main__.CommandLineOptionsTest)
Test --include-module-naming-schemes.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/sw/easybuild/software/Core/EasyBuild/2.8.0/lib/python2.7/site-packages/easybuild_framework-2.8.0-py2.7.egg/test/framework/options.py", line 1914, in test_include_module_naming_schemes
    logtxt, _= run_cmd("cd %s; eb %s" % (self.test_prefix, ' '.join(args)), simple=False)
  File "/sw/easybuild/software/Core/EasyBuild/2.8.0/lib/python2.7/site-packages/easybuild_framework-2.8.0-py2.7.egg/easybuild/tools/run.py", line 149, in run_cmd
    return parse_cmd_output(cmd, stdouterr, ec, simple, log_all, log_ok, regexp)
  File "/sw/easybuild/software/Core/EasyBuild/2.8.0/lib/python2.7/site-packages/easybuild_framework-2.8.0-py2.7.egg/easybuild/tools/run.py", line 397, in parse_cmd_output
    raise EasyBuildError('cmd "%s" exited with exitcode %s and output:\n%s', cmd, ec, stdouterr)
EasyBuildError: 'cmd "cd /tmp/eb-uVWywG/eb-NcCemZ; eb --avail-module-naming-schemes" exited with exitcode 127 and output:\n/bin/bash: eb: command not found\n'

======================================================================
ERROR: test_include_toolchains (__main__.CommandLineOptionsTest)
Test --include-toolchains.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/sw/easybuild/software/Core/EasyBuild/2.8.0/lib/python2.7/site-packages/easybuild_framework-2.8.0-py2.7.egg/test/framework/options.py", line 1988, in test_include_toolchains
    logtxt, _= run_cmd("cd %s; eb %s" % (self.test_prefix, ' '.join(args)), simple=False)
  File "/sw/easybuild/software/Core/EasyBuild/2.8.0/lib/python2.7/site-packages/easybuild_framework-2.8.0-py2.7.egg/easybuild/tools/run.py", line 149, in run_cmd
    return parse_cmd_output(cmd, stdouterr, ec, simple, log_all, log_ok, regexp)
  File "/sw/easybuild/software/Core/EasyBuild/2.8.0/lib/python2.7/site-packages/easybuild_framework-2.8.0-py2.7.egg/easybuild/tools/run.py", line 397, in parse_cmd_output
    raise EasyBuildError('cmd "%s" exited with exitcode %s and output:\n%s', cmd, ec, stdouterr)
EasyBuildError: 'cmd "cd /tmp/eb-uVWywG/eb-T7_slN; eb --list-toolchains" exited with exitcode 127 and output:\n/bin/bash: eb: command not found\n'

----------------------------------------------------------------------
Ran 59 tests in 273.168s

FAILED (errors=2)

@jhein32
Author

jhein32 commented May 24, 2016

As you suggested, I ran the grep; it returns nothing:

-bash-4.2$ grep purge /sw/easybuild/software/Core/EasyBuild/2.8.0/lib/python2.7/site-packages/easybuild_framework-2.8.0-py2.7.egg/test/framework/utilities.py
-bash-4.2$ 

@jhein32
Author

jhein32 commented May 24, 2016

As for your third question, I get:

-bash-4.2$ python -O -m test.framework.options
INFO: This is (based on) vsc.install.shared_setup 0.10.2
...................Skipping test_from_pr, no GitHub token available?
.Skipping test_from_pr, no GitHub token available?
.........EE......Skipping test_new_pr, no GitHub token available?
.......Skipping test_review_pr, no GitHub token available?
...............
======================================================================
ERROR: test_include_module_naming_schemes (__main__.CommandLineOptionsTest)
Test --include-module-naming-schemes.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/sw/easybuild/software/Core/EasyBuild/2.8.0/lib/python2.7/site-packages/easybuild_framework-2.8.0-py2.7.egg/test/framework/options.py", line 1914, in test_include_module_naming_schemes
    logtxt, _= run_cmd("cd %s; eb %s" % (self.test_prefix, ' '.join(args)), simple=False)
  File "/sw/easybuild/software/Core/EasyBuild/2.8.0/lib/python2.7/site-packages/easybuild_framework-2.8.0-py2.7.egg/easybuild/tools/run.py", line 149, in run_cmd
    return parse_cmd_output(cmd, stdouterr, ec, simple, log_all, log_ok, regexp)
  File "/sw/easybuild/software/Core/EasyBuild/2.8.0/lib/python2.7/site-packages/easybuild_framework-2.8.0-py2.7.egg/easybuild/tools/run.py", line 397, in parse_cmd_output
    raise EasyBuildError('cmd "%s" exited with exitcode %s and output:\n%s', cmd, ec, stdouterr)
EasyBuildError: 'cmd "cd /tmp/eb-Bk3Wql/eb-4gllXq; eb --avail-module-naming-schemes" exited with exitcode 127 and output:\n/bin/bash: eb: command not found\n'

======================================================================
ERROR: test_include_toolchains (__main__.CommandLineOptionsTest)
Test --include-toolchains.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/sw/easybuild/software/Core/EasyBuild/2.8.0/lib/python2.7/site-packages/easybuild_framework-2.8.0-py2.7.egg/test/framework/options.py", line 1988, in test_include_toolchains
    logtxt, _= run_cmd("cd %s; eb %s" % (self.test_prefix, ' '.join(args)), simple=False)
  File "/sw/easybuild/software/Core/EasyBuild/2.8.0/lib/python2.7/site-packages/easybuild_framework-2.8.0-py2.7.egg/easybuild/tools/run.py", line 149, in run_cmd
    return parse_cmd_output(cmd, stdouterr, ec, simple, log_all, log_ok, regexp)
  File "/sw/easybuild/software/Core/EasyBuild/2.8.0/lib/python2.7/site-packages/easybuild_framework-2.8.0-py2.7.egg/easybuild/tools/run.py", line 397, in parse_cmd_output
    raise EasyBuildError('cmd "%s" exited with exitcode %s and output:\n%s', cmd, ec, stdouterr)
EasyBuildError: 'cmd "cd /tmp/eb-Bk3Wql/eb-6ae_T4; eb --list-toolchains" exited with exitcode 127 and output:\n/bin/bash: eb: command not found\n'

----------------------------------------------------------------------
Ran 59 tests in 195.952s

FAILED (errors=2)

I checked my environment for the presence of the eb command and get:

-bash-4.2$ which eb
/sw/easybuild/software/Core/EasyBuild/2.8.0/bin/eb

which is what I expect it to be.

@jhein32
Author

jhein32 commented May 24, 2016

If it is important: the machine runs CentOS 7.

@jhein32
Author

jhein32 commented May 24, 2016

One more thing: even if I do not load the lmod module the test still fails.

@boegel
Member

boegel commented May 26, 2016

@jhein32 Thank you for all the details... The only thing I can think of is that you have the EasyBuild module installed in a module hierarchy, i.e. as Core/EasyBuild/2.8.0, but I don't see why that would matter...

I'll try and see if I can reproduce this. Do you mind sharing the configuration in place when you install/bootstrap EasyBuild (e.g. env | grep EASYBUILD)? Maybe there's something else that I'm overlooking...

@boegel
Member

boegel commented May 26, 2016

OK, so it has nothing to do with the module being in Core:

$ python bootstrap_eb.py $PWD
$ module use  /home/kehoste/modules/all/Core
$ module load EasyBuild/2.8.0
$ module list
Currently Loaded Modulefiles:
  1) EasyBuild/2.8.0
$ which eb
~/software/Core/EasyBuild/2.8.0/bin/eb

$ python -O -m test.framework.options
INFO: This is (based on) vsc.install.shared_setup 0.10.6
...................Skipping test_from_pr, no GitHub token available?
.Skipping test_from_pr, no GitHub token available?
.................Skipping test_new_pr, no GitHub token available?
.......Skipping test_review_pr, no GitHub token available?
...............
----------------------------------------------------------------------
Ran 59 tests in 70.810s

OK

Anything special in your .bashrc that fiddles with $PATH or eb? Any aliases or something?

@fgeorgatos
Collaborator

I.e., in other words: try it in another account with a totally clean .bashrc
and report the outcome!

On 26 May 2016 at 20:04, Kenneth Hoste notifications@github.com wrote:

OK, so it has nothing to do with the module being in Core:

$ python bootstrap_eb.py $PWD
$ module use /home/kehoste/modules/all/Core
$ module load EasyBuild/2.8.0
$ module list
Currently Loaded Modulefiles:
  1) EasyBuild/2.8.0
$ which eb
~/software/Core/EasyBuild/2.8.0/bin/eb

$ python -O -m test.framework.options
INFO: This is (based on) vsc.install.shared_setup 0.10.6
...................Skipping test_from_pr, no GitHub token available?
.Skipping test_from_pr, no GitHub token available?
.................Skipping test_new_pr, no GitHub token available?
.......Skipping test_review_pr, no GitHub token available?
...............
----------------------------------------------------------------------
Ran 59 tests in 70.810s

OK

Anything special in your .bashrc that fiddles with $PATH or eb? Any
aliases or something?



echo "sysadmin know better bash than english"|sed s/min/mins/
| sed 's/better bash/bash better/' # signal detected in a CERN forum

@jhein32
Author

jhein32 commented May 27, 2016

The account doesn't have a .bashrc

It is a shared account, which we use for building software.
I typically su into it, which sometimes messes things up.

Is there code we could insert in the spirit of printf debugging? If so, please send "git-free" instructions, e.g. copy paste.

@boegel
Member

boegel commented May 27, 2016

@jhein32: I'm a bit flabbergasted, not sure what is going on here...

I'll look into a bash script & accompanying Python file that hooks into the test framework for you to run, which will spit out a bunch of information that may help in pinpointing the problem.

I need to think about what info to collect though, since I'm sort of running out of ideas here... :)

@boegel
Member

boegel commented May 27, 2016

@jhein32: please try sourcing the bash script provided in https://gist.github.com/boegel/e672acd9f3ae0f3b4963cc1e326e4ef6, after copying the Python script provided in the same gist to test/framework/issue1685.py; see the comments at the top of both scripts, and provide us with the output (preferably in a gist, see https://gist.github.com)

@boegel
Member

boegel commented Jun 2, 2016

@jhein32 Thank you for providing the output from the script off-issue.

I'm now also able to reproduce your problem... I'm not exactly sure yet what's going on, but I'll figure it out, and keep you posted.

@boegel
Member

boegel commented Jun 2, 2016

OK, finally got this figured out...

The problem occurs because you're installing EasyBuild in a hierarchical module naming scheme (e.g. Core/EasyBuild/2.8.0), and using Lmod as a modules tool.

During the setup of each test, EasyBuild takes control of $MODULEPATH to isolate the tests from any modules you may have on your system. It does this via reset_modulepath (see test/framework/utilities.py) by running module unuse on each entry in the current $MODULEPATH.

When it runs module unuse <prefix>Core/, Lmod notices that the EasyBuild module is no longer available via $MODULEPATH, and marks it as inactive (which means unloading it).

That explains why eb is no longer available in the tests...
So, while the test setup is doing what it's supposed to do, it has a side-effect of breaking these two tests.

I'm not sure yet how to fix this, but it's only a problem with the tests themselves, not with your EasyBuild installation.

@jhein32 does this help for now?
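
The mechanism described above can be illustrated with a toy model. This is a sketch under stated assumptions, not Lmod's actual implementation and not EasyBuild's reset_modulepath: `ToyModuleEnv` is a hypothetical class, and the module directory `/sw/easybuild/modules/all/Core` is an assumed location (only the bin directory is taken from the `which eb` output earlier in the thread).

```python
class ToyModuleEnv:
    """Toy model of a modules tool; not Lmod's actual implementation."""

    def __init__(self):
        self.modulepath = []
        self.path = ["/usr/bin", "/bin"]
        self.loaded = {}  # module name -> (providing MODULEPATH entry, bin dir)

    def use(self, directory):
        if directory not in self.modulepath:
            self.modulepath.insert(0, directory)

    def load(self, name, provider, bin_dir):
        if provider not in self.modulepath:
            raise ValueError("%s is not in MODULEPATH" % provider)
        self.loaded[name] = (provider, bin_dir)
        self.path.insert(0, bin_dir)

    def unuse(self, directory):
        self.modulepath.remove(directory)
        # Modules that came from the removed entry become inactive,
        # so their PATH changes are undone.
        for name, (provider, bin_dir) in list(self.loaded.items()):
            if provider == directory:
                del self.loaded[name]
                self.path.remove(bin_dir)


env = ToyModuleEnv()
core = "/sw/easybuild/modules/all/Core"  # hypothetical module directory
eb_bin = "/sw/easybuild/software/Core/EasyBuild/2.8.0/bin"
env.use(core)
env.load("EasyBuild/2.8.0", core, eb_bin)
had_eb_before = eb_bin in env.path

# What the test setup effectively does: unuse every MODULEPATH entry.
for entry in list(env.modulepath):
    env.unuse(entry)
has_eb_after = eb_bin in env.path

print(had_eb_before, has_eb_after)
```

In this model, removing the directory that provides the EasyBuild module also removes its bin directory from the search path, which matches the "eb: command not found" errors in the two failing tests.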

@jhein32
Author

jhein32 commented Jun 2, 2016

Ah, being able to reproduce the issue is something. Please keep me posted.

Best wishes
Joachim

On 02 Jun 2016, at 13:38, Kenneth Hoste <notifications@github.com> wrote:

@jhein32 Thank you for providing the output from the script off-issue.

I'm now also able to reproduce your problem... I'm not exactly sure yet what's going on, but I'll figure it out, and keep you posted.



@jhein32
Author

jhein32 commented Jun 7, 2016

Hi Kenneth,

I didn't notice your second response the first time round. I am certainly OK with the current situation, in particular now that I know there is no real issue.

I am still trying to get my head around things. What I remember from installing EB is that Lmod and EB "naturally" go into their own modules. As far as I remember, having hierarchical modules is the "non-default" we have. Should we have done things differently? I am not sure this could be changed easily now, since the service is in production.

Thanks for your time on this.

Best wishes
Joachim

@boegel
Member

boegel commented Jun 7, 2016

Going with hierarchical modules or not is a decision you shouldn't take lightly, but it should work; the people at JSC (@ocaisa, @damianam) are using a setup like that in production, and seem quite happy with it. At HPC-UGent we're still using a flat module tree, but once we have Lmod in production, we might reconsider that. The good thing about using EasyBuild is that it will at some point allow you to maintain two (or more) module trees (that are potentially very different) side-by-side, with very little effort.

Bugs do pop up sometimes (like this one), including ones that are not related to hierarchical modules.
We do what we can to resolve these issues properly, and put tests in place to prevent them from being reintroduced.

I will try and get this minor issue resolved by the next release, but I need to think about the right way to tackle it.

It'll probably involve making sure that we record the location where EasyBuild is available ($PATH and $PYTHONPATH), such that unloading the EasyBuild module does not affect the tests...
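
One way to picture that idea, as a hedged sketch only (assuming Python 3; this is not the change made in the eventual fix): record the directory that provides the command before the module environment is reset, then put it back on the search path afterwards. `locate_tool_dir`, `ensure_in_path` and `fake-eb` are hypothetical names used for illustration.

```python
import os
import shutil
import tempfile


def locate_tool_dir(tool, search_path):
    """Return the directory in search_path that provides tool, or None."""
    found = shutil.which(tool, path=search_path)
    return os.path.dirname(found) if found else None


def ensure_in_path(path_value, directory):
    """Return path_value with directory prepended if it is missing."""
    entries = [e for e in path_value.split(os.pathsep) if e]
    if directory not in entries:
        entries.insert(0, directory)
    return os.pathsep.join(entries)


# Demo with a throwaway 'fake-eb' executable (hypothetical stand-in for eb).
tool_dir = tempfile.mkdtemp()
tool = os.path.join(tool_dir, "fake-eb")
with open(tool, "w") as handle:
    handle.write("#!/bin/sh\necho fake-eb ran\n")
os.chmod(tool, 0o755)

path_before = tool_dir + os.pathsep + "/usr/bin" + os.pathsep + "/bin"
recorded = locate_tool_dir("fake-eb", path_before)   # record before the reset
path_after_reset = "/usr/bin" + os.pathsep + "/bin"  # module unloaded: entry gone
restored = ensure_in_path(path_after_reset, recorded)

print(restored.split(os.pathsep)[0] == tool_dir)
```

The same `ensure_in_path` helper could be applied to a $PYTHONPATH value, since both variables are lists of directories joined by the same separator.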

@boegel
Member

boegel commented Jun 13, 2016

should be fixed with #1806

@jhein32
Author

jhein32 commented Jun 30, 2016

Hi,
I hacked the changes in #1806 into our EB 2.8.1

python -O -m test.framework.options

does not produce errors. I assume this can now be closed.

@boegel
Member

boegel commented Jul 4, 2016

@jhein32 great, thanks for the feedback!

@boegel boegel closed this as completed Jul 4, 2016