Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use correct module for errors_found_in_log #3119

Merged
merged 4 commits into from
Feb 19, 2020

Conversation

Flamefire
Copy link
Contributor

The (important) warning message was never printed due to a wrong module being used after a rename/move some time ago

test/framework/utilities.py Outdated Show resolved Hide resolved
test/framework/utilities.py Outdated Show resolved Hide resolved
test/framework/utilities.py Outdated Show resolved Hide resolved
ocaisa
ocaisa previously approved these changes Dec 10, 2019
Copy link
Member

@ocaisa ocaisa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me this looks good, I will take a look tomorrow to see how hard it would be to come up a test

@ocaisa
Copy link
Member

ocaisa commented Dec 10, 2019

For me this looks good, I will take a look tomorrow to see how hard it would be to come up a test

Actually the test is there, this was really just incorrect import/usage.

@Flamefire
Copy link
Contributor Author

Flamefire commented Dec 10, 2019

Hmm, haven't found one by grepping for errors_found_in_log . Probably best to check the log anyway

@boegel boegel added this to the release after 4.1.0 (4.1.1?) milestone Dec 10, 2019
@boegel
Copy link
Member

boegel commented Dec 10, 2019

Nice catch @Flamefire...

We indeed overlooked this errors_found_in_log global variable when moving parse_log_for_error from easybuild.tools.filetools to easybuild.tools.run a (very) long time ago (see #842).

I'm a bit reluctant to re-instate the warning like this though, since the pattern used by parse_log_for_error for error seems very generic to me now that I take a closer look at it...

It basically considers any occurrence of error or failed as a clear indication that something went wrong when running a command, while we know now that this often leads to many false positives.

This is from the CMake log file for example, after checking the output of ./configure ...:

== 2019-11-03 23:36:42,996 run.py:589 INFO parse_log_for_error (some may be harmless) regExp (?<![(,-]|\w)(?:error|segmentation fault|failed)(?![(,-]|\.?\w) found:
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed                                                                                                                                                         -- Check size of __int64 - failed
-- Check size of unsigned __int64 - failed                                                                                                                                                                  -- Performing Test HAVE_WORKING_EXT2_IOC_GETFLAGS - Failed
-- Performing Test HAVE_STRUCT_VFSCONF - Failed                                                                                                                                                             -- Performing Test HAVE_STRUCT_XVFSCONF - Failed
-- Performing Test MAJOR_IN_MKDEV - Failed                                                                                                                                                                  -- Performing Test HAVE_LZMA_STREAM_ENCODER_MT - Failed
-- Performing Test HAVE_STRUCT_TM___TM_GMTOFF - Failed                                                                                                                                                      -- Performing Test HAVE_STRUCT_STATFS_F_NAMEMAX - Failed
-- Performing Test HAVE_STRUCT_STAT_ST_BIRTHTIME - Failed                                                                                                                                                   -- Performing Test HAVE_STRUCT_STAT_ST_BIRTHTIMESPEC_TV_NSEC - Failed
-- Performing Test HAVE_STRUCT_STAT_ST_MTIMESPEC_TV_NSEC - Failed                                                                                                                                           -- Performing Test HAVE_STRUCT_STAT_ST_MTIME_N - Failed
-- Performing Test HAVE_STRUCT_STAT_ST_UMTIME - Failed                                                                                                                                                      -- Performing Test HAVE_STRUCT_STAT_ST_MTIME_USEC - Failed
-- Performing Test HAVE_STRUCT_STAT_ST_FLAGS - Failed                                                                                                                                                       -- Performing Test HAVE_STRUCT_STATVFS_F_IOSIZE - Failed

So, I'm not sure that re-instating this without doing a better job at filtering out false positives (which can be hard to distinguish from actual problems in practice)...

@Flamefire
Copy link
Contributor Author

Found another bug in the code, so it seems the log message is not covered by any test.

I'm a bit reluctant to re-instate the warning like this though, since the pattern used by parse_log_for_error for error seems very generic to me now that I take a closer look at it...

It basically considers any occurrence of error or failed as a clear indication that something went wrong when running a command, while we know now that this often leads to many false positives.

Then my proposal would be:

@Flamefire
Copy link
Contributor Author

Together with #3118 I could use it on our system to install a big package chain and sort out the most common false positives. I expect most of them to be the same, so should be easy to reduce the amount of false-positives considerably

ocaisa
ocaisa previously approved these changes Dec 12, 2019
Copy link
Member

@ocaisa ocaisa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this but since there's a big connection to #3118 I'll leave it to @boegel to merge

@boegel
Copy link
Member

boegel commented Jan 14, 2020

Reducing the false positives is nice, but I still feel that producing a big fat warning for every occurrence of something that looks like an error but may not be is not a good idea.

In some cases you may not even be able to automatically distinguish false positives from actual errors...

How about changing the print_warning to log.warning, to just log in the log file rather than spit it out in the output of the eb command?

@Flamefire
Copy link
Contributor Author

Sure. This PR was just meant to fix a regression and not question the initial intention of that feature. I'm all for replacing this by something more advanced.

On the other hand: Providing this warnings to someone testing a EC as a "warning" could still be useful. Checking for false-positives can be done by the user and common false-positives can be suppressed by EB (see my other PR). I mean: In the end this is a warning that something might be wrong. We can't know if this is a false-positive or not. For the second case it would be wrong not to show it.

@boegel
Copy link
Member

boegel commented Jan 14, 2020

I understand your point, but I think there will be a lot more false positives than genuine errors that go totally undetected through other means (like a failing sanity check).

We certainly have had errors that went undetected for a while, but I'd say they are rare, and so I'm reluctant to start raising concerns by highlighting false positives...

@Flamefire
Copy link
Contributor Author

Ok so _log.warning instead

smoors
smoors previously approved these changes Feb 17, 2020
Copy link
Contributor

@smoors smoors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor

@smoors smoors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@smoors smoors merged commit e453884 into easybuilders:develop Feb 19, 2020
@Flamefire Flamefire deleted the fix_error_display branch February 19, 2020 16:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants