RHOAIENG-18979: chore(test/containers): check workbench startup logs #886
Conversation
Skipping CI for Draft Pull Request.
```python
for keyword in blocked_keywords:
    if keyword in line:
        if not any(allowed in line for allowed in allowed_messages):
            logging.debug(f"Unexpected keyword '{keyword}' in the following message: '{line}'")
```
logging.info? or maybe logging.warning?
you are right, I updated it to error
actually
```python
# let's check that logs don't contain any error or unexpected warning message
with subtests.test("Checking the log in the workbench..."):
    failed_lines = 0
```
I have three ideas to avoid counting lines ourselves:
- when there is a failure, do a `with subtests.test(): assert ...`, or `with subtests.test(): pytest.fail(...)`, that is, create a subtest for every failure and immediately fail it
- there is https://github.com/okken/pytest-check
- and there is https://github.com/astraw38/pytest-assume

I don't really have a preference; this counter is not exactly a bad solution either.
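A minimal sketch of the first idea, one subtest per offending line (the test name, the `workbench_logs` fixture, and the keyword lists are illustrative, not the PR's actual code):

```python
import pytest

def test_workbench_startup_logs(subtests, workbench_logs):
    # `workbench_logs` is assumed here to be the container log as a string;
    # the real test would read it from the running container
    blocked_keywords = ["ERROR", "Traceback", "WARNING"]          # illustrative values
    allowed_messages = ["WARNING: known benign startup message"]  # illustrative values

    for line in workbench_logs.splitlines():
        if any(k in line for k in blocked_keywords) and not any(a in line for a in allowed_messages):
            # open a dedicated subtest for the offending line and fail it immediately
            with subtests.test(msg="unexpected log line"):
                pytest.fail(f"Unexpected message in the workbench log: {line!r}")
```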
Okay, I kept this as one test since we probably don't want a different number of subtests depending on how many log messages there are 🤔 I chose to alter this part of the code based on your other suggestion.
```python
for keyword in blocked_keywords:
    if keyword in line:
        if not any(allowed in line for allowed in allowed_messages):
            logging.debug(f"Unexpected keyword '{keyword}' in the following message: '{line}'")
            failed_lines += 1
            # Let's go on another line now as we don't care if multiple blocked keywords are present in this line
            break
```
just this small idea that saves one level of indentation, and the comment is no longer necessary imo
```diff
-for keyword in blocked_keywords:
-    if keyword in line:
-        if not any(allowed in line for allowed in allowed_messages):
-            logging.debug(f"Unexpected keyword '{keyword}' in the following message: '{line}'")
-            failed_lines += 1
-            # Let's go on another line now as we don't care if multiple blocked keywords are present in this line
-            break
+if any(keyword in line for keyword in blocked_keywords):
+    if not any(allowed in line for allowed in allowed_messages):
+        logging.debug(f"Unexpected keyword '{keyword}' in the following message: '{line}'")
+        failed_lines += 1
```
This has the disadvantage that I am not able to print the keyword that caused the line to fail... but yeah, I'll probably amend it this way.
> This has the disadvantage that I am not able to print the keyword that caused the line to fail...

you could maybe run the check a second time: when you hit a bad line, do this, because it's quick and easy

```python
line = "error info warning"
blocked_keywords = ["error", "warning"]
print("failures:", {keyword: keyword in line for keyword in blocked_keywords})
# failures: {'error': True, 'warning': True}
```
That would be too verbose... let's not complicate it for now; we can always improve it later.
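For reference, a variant that keeps the matched keyword without the extra nesting and without scanning the line twice could look roughly like this (the function name and parameters are made up for illustration, not code from the PR):

```python
import logging

def count_unexpected_lines(logs: str, blocked_keywords: list[str], allowed_messages: list[str]) -> int:
    failed_lines = 0
    for line in logs.splitlines():
        # first blocked keyword present in the line, or None if the line is clean
        matched = next((keyword for keyword in blocked_keywords if keyword in line), None)
        if matched is not None and not any(allowed in line for allowed in allowed_messages):
            logging.debug(f"Unexpected keyword '{matched}' in the following message: '{line}'")
            failed_lines += 1
    return failed_lines
```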
```
[Checking the log in the workbench...] SUBFAIL tests/containers/workbenches/workbench_image_test.py::TestWorkbenchImage::test_image_entrypoint_starts[ghcr.io/opendatahub-io/notebooks/workbench-images:jupyter-trustyai-ubi9-python-3.11-889_merge_ad1bfa2899f33cd9962985d35165c6b6b3b5caf3-sysctls1] - Failed: Log message(s) (3) that violate our checks occurred during the workbench startup:
[W 2025-02-06 08:26:45.534 LabApp] Failed to instantiate the extension manager pypi. Falling back to read-only manager.
Traceback (most recent call last):
TypeError: AsyncClient.__init__() got an unexpected keyword argument 'proxies'
```

This is good output, I like it. From https://github.com/opendatahub-io/notebooks/actions/runs/13174582770/job/36771027197?pr=889#step:25:151
```python
skip_if_not_workbench_image(image)

# Here is a list of blocked keywords we don't want to see in the log messages during the container/workbench
```
maybe put all this into a helper function (the logs and the subtests object can be parameters to it) and also call it from the IPv6-only test below?
That is a good point, I didn't think about that - updated per your suggestion now.
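For illustration, the shared helper could look something like this (the function name and the keyword lists are made up here; the PR's actual helper may differ). Both the regular startup test and the IPv6-only test would then call it with their container's logs and the `subtests` fixture:

```python
import logging

def check_for_unexpected_log_messages(logs: str, subtests) -> None:
    """Hypothetical shared helper: scan the workbench startup logs for suspicious messages."""
    # keywords we don't want to see in the log messages during the workbench startup
    blocked_keywords = ["ERROR", "Traceback", "WARNING"]          # illustrative values
    allowed_messages = ["WARNING: known benign startup message"]  # illustrative values

    # let's check that logs don't contain any error or unexpected warning message
    with subtests.test("Checking the log in the workbench..."):
        failed_lines = 0
        for line in logs.splitlines():
            if any(keyword in line for keyword in blocked_keywords):
                if not any(allowed in line for allowed in allowed_messages):
                    logging.debug(f"Unexpected message in the workbench log: '{line}'")
                    failed_lines += 1
        assert failed_lines == 0, f"{failed_lines} unexpected log message(s) during the workbench startup"
```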
```python
# let's check that logs don't contain any error or unexpected warning message
with subtests.test("Checking the log in the workbench..."):
    failed_lines = 0
```
on second thought, maybe I'd make this

```diff
-failed_lines = 0
+failed_lines: list[str] = []
```

and `.append` the failed lines to it - why should people hunt for them in the log when they can be kept here and produced in the final failure message?
yup, I like it - updated
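A minimal sketch of the list-based version (illustrative only; `logs`, `blocked_keywords`, `allowed_messages`, and `subtests` are assumed to come from the surrounding test, and the failure message wording mirrors the output shown above):

```python
import pytest

with subtests.test("Checking the log in the workbench..."):
    failed_lines: list[str] = []
    for line in logs.splitlines():
        if any(keyword in line for keyword in blocked_keywords) and not any(
            allowed in line for allowed in allowed_messages
        ):
            failed_lines.append(line)
    if failed_lines:
        pytest.fail(
            f"Log message(s) ({len(failed_lines)}) that violate our checks occurred during the workbench startup:\n"
            + "\n".join(failed_lines)
        )
```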
Force-pushed from 96ccd47 to bba8309
```python
finally:
    # try to grab logs regardless of whether container started or not
    # TODO - do we want to enforce couple of seconds wait to be sure nothing extra suspicious is logged for the workbench???
```
Either that, or possibly we may even want to exercise JupyterLab a bit. Every Playwright test that we eventually create could be checking logs like this, maybe. Future work.
Minor nit - but I thought we were told we needed to write Cypress tests(?)
The JupyterLab and code-server upstreams use Playwright; Elyra and the ODH dashboard (and Notebooks 2.0) use Cypress. Dunno about RStudio. Most importantly, we need to use what's already present.
Okay, so I put a 3-second wait there for now... we may adjust this in the future or remove it eventually.
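Roughly what that could look like (a sketch only; the helper name is hypothetical, and the `get_logs()` call assumes a testcontainers-style container object):

```python
import time

def grab_startup_logs(container, grace_period_s: float = 3.0) -> str:
    """Hypothetical helper: wait briefly, then collect whatever the workbench has logged so far."""
    # give the workbench a few extra seconds so that late startup messages still land in the log
    time.sleep(grace_period_s)
    # grab logs regardless of whether the container started successfully
    stdout, stderr = container.get_logs()  # testcontainers returns a (stdout, stderr) tuple of bytes
    return (stdout + stderr).decode("utf-8", errors="replace")
```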
Force-pushed from bba8309 to cff8e0d
Force-pushed from cff8e0d to bb25555
Force-pushed from bb25555 to 8bf8ab0
This adds a new check of the content of the workbench startup logs, searching for keywords that may indicate a problem or issue.
Force-pushed from 8bf8ab0 to 466b468
Thank you for your preliminary review, guys. Hopefully I addressed all the comments, and I marked this as ready for review now. Also, I'll hold off on merging this until https://issues.redhat.com/browse/RHOAIENG-18930 is addressed as well; otherwise we would have failing tests in our CI in the meantime.
/lgtm
Okay, so the fix for Jupyter is prepared here: #889. I would go with the following steps:
/lgtm
Thank you for your reviews and comments, guys! /approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: jstourac. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/override ci/prow/images
@jstourac: Overrode contexts on behalf of jstourac: build (cuda-jupyter-tensorflow-ubi9-python-3.11) / build (ubuntu-22.04), build (jupyter-pytorch-ubi9-python-3.11) / build (ubuntu-22.04), build (jupyter-trustyai-ubi9-python-3.11) / build (ubuntu-22.04), build (rocm-jupyter-pytorch-ubi9-python-3.11) / build (ubuntu-22.04), build (rocm-jupyter-tensorflow-ubi9-python-3.11) / build (ubuntu-22.04), ci/prow/images In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Merged commit 8f2349b into opendatahub-io:main
This adds a new check of the content of the workbench startup logs, searching for keywords that may indicate a problem or issue.
https://issues.redhat.com/browse/RHOAIENG-18979
Description
How Has This Been Tested?
Merge criteria: