RHOAIENG-18979: chore(test/containers): check workbench startup logs #886

jstourac · 2025-02-05T10:19:04Z

This adds a new check for the content of the workbench startup logs searching for some keywords that may indicate some problem or issue.

https://issues.redhat.com/browse/RHOAIENG-18979

Description

How Has This Been Tested?

DOCKER_HOST=unix:///run/user/$UID/podman/podman.sock poetry run pytest tests/containers --image quay.io/modh/odh-generic-data-science-notebook@sha256:0efbb3ad6f8f342360cf1f002d40716a39d4c58f69163e053d5bd19b4fe732d4 -k test_image_entrypoint_starts

DOCKER_HOST=unix:///run/user/$UID/podman/podman.sock poetry run pytest tests/containers --image quay.io/opendatahub/workbench-images@sha256:27d87f50f7aad6185344368b1b00c446773cff12bbea2f87e8d0c4ece2b13d87 -k test_image_entrypoint_starts

Merge criteria:

The commits are squashed in a cohesive manner and have meaningful messages.
Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
The developer has manually tested the changes and verified that the changes work

openshift-ci · 2025-02-05T10:19:08Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

jiridanek · 2025-02-05T11:58:57Z

tests/containers/workbenches/workbench_image_test.py

+                        for keyword in blocked_keywords:
+                            if keyword in line:
+                                if not any(allowed in line for allowed in allowed_messages):
+                                    logging.debug(f"Unexpected keyword '{keyword}' in the following message: '{line}'")


logging.info? or maybe logging.warning?

you are right, I updated it to error actually

jiridanek · 2025-02-05T12:01:01Z

tests/containers/workbenches/workbench_image_test.py

+
+                # let's check that logs don't contain any error or unexpected warning message
+                with subtests.test("Checking the log in the workbench..."):
+                    failed_lines = 0


I have three ideas to avoid counting lines ourselves

when there is a failure, do a with subtests.test(): assert ..., or with subtests.test(): pytest.fail(...), that is, create a subtest for every failed test and immediately fail it

there is https://github.com/okken/pytest-check

and there is https://github.com/astraw38/pytest-assume

don't really have a preference, this counter is not exactly a bad solution either

Okay, I kept this as one test since we probably don't want different number of subtests depending on how many log messages are there 🤔 I chose to alternate this part of code based on your other suggestion.

jiridanek · 2025-02-05T12:07:41Z

tests/containers/workbenches/workbench_image_test.py

+                        for keyword in blocked_keywords:
+                            if keyword in line:
+                                if not any(allowed in line for allowed in allowed_messages):
+                                    logging.debug(f"Unexpected keyword '{keyword}' in the following message: '{line}'")
+                                    failed_lines += 1
+                                    # Let's go on another line now as we don't care if multiple blocked keywords are present in this line
+                                    break


just this small idea that saves one level of indentation and comment is no longer necessary imo

Suggested change

for keyword in blocked_keywords:

if keyword in line:

if not any(allowed in line for allowed in allowed_messages):

logging.debug(f"Unexpected keyword '{keyword}' in the following message: '{line}'")

failed_lines += 1

# Let's go on another line now as we don't care if multiple blocked keywords are present in this line

break

if any(keyword in line for keyword in blocked_keywords):

if not any(allowed in line for allowed in allowed_messages):

logging.debug(f"Unexpected keyword '{keyword}' in the following message: '{line}'")

failed_lines += 1

This has a disadvantage as I am not able to print the keyword that caused this to fail... but yeah, I'll probably amend that this way.

This has a disadvantage as I am not able to print the keyword that caused this to fail...

you could run this twice maybe, when you have wrong line, do this because it's quick and easy

line = "error info warning" blocked_keywords = ["error", "warning"] print("failures:", {keyword: keyword in line for keyword in blocked_keywords}) >>> failures: {'error': True, 'warning': True}

that would be too verbose... let's not complicate it for now, we can always improve

[Checking the log in the workbench...] SUBFAIL tests/containers/workbenches/workbench_image_test.py::TestWorkbenchImage::test_image_entrypoint_starts[ghcr.io/opendatahub-io/notebooks/workbench-images:jupyter-trustyai-ubi9-python-3.11-889_merge_ad1bfa2899f33cd9962985d35165c6b6b3b5caf3-sysctls1] - Failed: Log message(s) (3) that violate our checks occurred during the workbench startup: [W 2025-02-06 08:26:45.534 LabApp] Failed to instantiate the extension manager pypi. Falling back to read-only manager. Traceback (most recent call last): TypeError: AsyncClient.__init__() got an unexpected keyword argument 'proxies'

This is good output, I like it. From https://github.com/opendatahub-io/notebooks/actions/runs/13174582770/job/36771027197?pr=889#step:25:151

jiridanek · 2025-02-05T12:10:20Z

tests/containers/workbenches/workbench_image_test.py

        skip_if_not_workbench_image(image)

+        # Here is a list of blocked keywords we don't want to see in the log messages during the container/workbench


maybe put all this into a helper function (logs and subtests object can be a parameter to it) and also call it from the ipv6 only test below?

that is a good point, didn't think about that - updated per your suggestion now

jiridanek · 2025-02-05T12:13:04Z

tests/containers/workbenches/workbench_image_test.py

+
+                # let's check that logs don't contain any error or unexpected warning message
+                with subtests.test("Checking the log in the workbench..."):
+                    failed_lines = 0


on second thought, maybe I'd make this

Suggested change

failed_lines = 0

failed_lines: list[str] = []

and .append to it the failed lines

why should people hunt for them in the log when they can be kept here and produced in the final failure message

yup, I like it - updated

jiridanek · 2025-02-05T13:30:07Z

tests/containers/workbenches/workbench_image_test.py

            finally:
                # try to grab logs regardless of whether container started or not
+                # TODO - do we want to enforce couple of seconds wait to be sure nothing extra suspicious is logged for the workbench???


either that, or possibly we may even want to try to use jupyterlab a bit

every playwright test that we're going to eventually create could be checking logs like this, even, maybe

future work

minor nit - but i thought we were told we needed to write cypress tests (?)

jupyterlab and codeserver upstreams use playwright

elyra and odh dashboard (and notebooks 2.00 use cypress

dunno about rstudio

most importantly we need to use what's already present

okay, so I put a 3 second wait there for now... we may alternate this in the future or remove eventually

This adds a new check for the content of the workbench startup logs searching for some keywords that may indicate some problem or issue.

jstourac · 2025-02-06T07:25:37Z

Thank you for your preliminary review, guys. I addressed all comments hopefully and marked this as ready for review now.

Also, I'll wait with merging this until the https://issues.redhat.com/browse/RHOAIENG-18930 is addressed also, otherwise we would have failing tests in our CI in the meantime.

jiridanek · 2025-02-06T08:35:11Z

/lgtm

jstourac · 2025-02-06T09:18:37Z

Okay, so the fix for the Jupyter is prepared here #889. I would go with the following steps:

merge this thing
then rebase the RHOAIENG-18930: update Jupyterlab package to 4.2.7 #889 and follow there

atheo89 · 2025-02-07T08:08:19Z

/lgtm

jstourac · 2025-02-07T14:02:12Z

Thank you for your reviews and comments, guys!

/approve

openshift-ci · 2025-02-07T14:02:19Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jstourac

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [jstourac]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

jstourac · 2025-02-07T14:05:46Z

/override ci/prow/images
/override "build (rocm-jupyter-tensorflow-ubi9-python-3.11) / build (ubuntu-22.04)"
/override "build (rocm-jupyter-pytorch-ubi9-python-3.11) / build (ubuntu-22.04)"
/override "build (jupyter-pytorch-ubi9-python-3.11) / build (ubuntu-22.04)"
/override "build (jupyter-trustyai-ubi9-python-3.11) / build (ubuntu-22.04)"
/override "build (cuda-jupyter-tensorflow-ubi9-python-3.11) / build (ubuntu-22.04)"

openshift-ci · 2025-02-07T14:05:56Z

@jstourac: Overrode contexts on behalf of jstourac: build (cuda-jupyter-tensorflow-ubi9-python-3.11) / build (ubuntu-22.04), build (jupyter-pytorch-ubi9-python-3.11) / build (ubuntu-22.04), build (jupyter-trustyai-ubi9-python-3.11) / build (ubuntu-22.04), build (rocm-jupyter-pytorch-ubi9-python-3.11) / build (ubuntu-22.04), build (rocm-jupyter-tensorflow-ubi9-python-3.11) / build (ubuntu-22.04), ci/prow/images

In response to this:

/override ci/prow/images
/override "build (rocm-jupyter-tensorflow-ubi9-python-3.11) / build (ubuntu-22.04)"
/override "build (rocm-jupyter-pytorch-ubi9-python-3.11) / build (ubuntu-22.04)"
/override "build (jupyter-pytorch-ubi9-python-3.11) / build (ubuntu-22.04)"
/override "build (jupyter-trustyai-ubi9-python-3.11) / build (ubuntu-22.04)"
/override "build (cuda-jupyter-tensorflow-ubi9-python-3.11) / build (ubuntu-22.04)"

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

jstourac requested a review from jiridanek February 5, 2025 10:19

jstourac self-assigned this Feb 5, 2025

openshift-ci bot added the do-not-merge/work-in-progress label Feb 5, 2025

openshift-ci bot added size/m and removed size/m labels Feb 5, 2025

jiridanek reviewed Feb 5, 2025

View reviewed changes

jstourac force-pushed the checkWorkbenchStartupLogs branch from 96ccd47 to bba8309 Compare February 5, 2025 13:15

openshift-ci bot added size/m and removed size/m labels Feb 5, 2025

jiridanek reviewed Feb 5, 2025

View reviewed changes

jstourac force-pushed the checkWorkbenchStartupLogs branch from bba8309 to cff8e0d Compare February 6, 2025 06:33

openshift-ci bot added size/m and removed size/m labels Feb 6, 2025

jstourac force-pushed the checkWorkbenchStartupLogs branch from cff8e0d to bb25555 Compare February 6, 2025 06:40

openshift-ci bot added size/m and removed size/m labels Feb 6, 2025

jstourac force-pushed the checkWorkbenchStartupLogs branch from bb25555 to 8bf8ab0 Compare February 6, 2025 07:03

RHOAIENG-18979: chore(test/containers): check workbench startup logs

466b468

This adds a new check for the content of the workbench startup logs searching for some keywords that may indicate some problem or issue.

openshift-ci bot added size/m and removed size/m labels Feb 6, 2025

jstourac force-pushed the checkWorkbenchStartupLogs branch from 8bf8ab0 to 466b468 Compare February 6, 2025 07:03

openshift-ci bot added size/m and removed size/m labels Feb 6, 2025

jstourac marked this pull request as ready for review February 6, 2025 07:05

openshift-ci bot removed the do-not-merge/work-in-progress label Feb 6, 2025

openshift-ci bot requested review from atheo89 and harshad16 February 6, 2025 07:05

openshift-ci bot assigned jiridanek Feb 6, 2025

openshift-ci bot added the lgtm label Feb 6, 2025

openshift-ci bot assigned atheo89 Feb 7, 2025

openshift-ci bot added the approved label Feb 7, 2025

openshift-merge-bot bot merged commit 8f2349b into opendatahub-io:main Feb 7, 2025
18 of 23 checks passed

jstourac deleted the checkWorkbenchStartupLogs branch February 7, 2025 14:19

jstourac mentioned this pull request Feb 7, 2025

RHOAIENG-18930: update Jupyterlab package to 4.2.7 #889

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RHOAIENG-18979: chore(test/containers): check workbench startup logs #886

RHOAIENG-18979: chore(test/containers): check workbench startup logs #886

jstourac commented Feb 5, 2025 •

edited

Loading

openshift-ci bot commented Feb 5, 2025

jiridanek Feb 5, 2025

jstourac Feb 6, 2025

jiridanek Feb 5, 2025 •

edited

Loading

jstourac Feb 6, 2025

jiridanek Feb 5, 2025

jstourac Feb 5, 2025

jiridanek Feb 5, 2025

jstourac Feb 6, 2025

jiridanek Feb 6, 2025

jiridanek Feb 5, 2025

jstourac Feb 6, 2025

jiridanek Feb 5, 2025 •

edited

Loading

jstourac Feb 6, 2025

jiridanek Feb 5, 2025

andyatmiami Feb 5, 2025

jiridanek Feb 5, 2025 •

edited

Loading

jstourac Feb 6, 2025

jstourac commented Feb 6, 2025

jiridanek commented Feb 6, 2025

jstourac commented Feb 6, 2025

atheo89 commented Feb 7, 2025

jstourac commented Feb 7, 2025

openshift-ci bot commented Feb 7, 2025

jstourac commented Feb 7, 2025

openshift-ci bot commented Feb 7, 2025

		skip_if_not_workbench_image(image)

		# Here is a list of blocked keywords we don't want to see in the log messages during the container/workbench

RHOAIENG-18979: chore(test/containers): check workbench startup logs #886

RHOAIENG-18979: chore(test/containers): check workbench startup logs #886

Conversation

jstourac commented Feb 5, 2025 • edited Loading

Description

How Has This Been Tested?

Merge criteria:

openshift-ci bot commented Feb 5, 2025

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jiridanek Feb 5, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jiridanek Feb 5, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jiridanek Feb 5, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jstourac commented Feb 6, 2025

jiridanek commented Feb 6, 2025

jstourac commented Feb 6, 2025

atheo89 commented Feb 7, 2025

jstourac commented Feb 7, 2025

openshift-ci bot commented Feb 7, 2025

jstourac commented Feb 7, 2025

openshift-ci bot commented Feb 7, 2025

jstourac commented Feb 5, 2025 •

edited

Loading

jiridanek Feb 5, 2025 •

edited

Loading

jiridanek Feb 5, 2025 •

edited

Loading

jiridanek Feb 5, 2025 •

edited

Loading