[Testing] Add debugging feature which leaves integration test containers running after test completes #9626

lhotari · 2021-02-19T07:28:08Z

Motivation

For debugging purposes, it is useful to be able to leave the integration test containers running after the test completes. For example, this feature was necessary for investigating issue #9622: it made it possible to view the log files and find the cause. If the containers are killed, this option is lost.

Modifications

Adds handling to the initialization and stopping of Pulsar containers and Pulsar cluster so that containers get configured using Testcontainers "reuse mode" which leaves containers running after the test JVM stops. Normally the Testcontainers automatic container cleanup feature stops all containers which weren't explicitly stopped during the tests.
Testcontainers reuse mode must be enabled by setting environment variable TESTCONTAINERS_REUSE_ENABLE=true (or by setting testcontainers.reuse.enable=true in ~/.testcontainers.properties).

The modifications in this PR skip stopping PulsarContainer and PulsarCluster instances if environment variable PULSAR_CONTAINERS_LEAVE_RUNNING=true .
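As an illustration of the conditional stop described above, here is a minimal, dependency-free sketch. It is not the code from this PR: `LeaveRunningSketch` and `FakeContainer` are hypothetical stand-ins for the real container classes, and only the environment variable name comes from the description. With Testcontainers itself, the reuse part would additionally involve calling `withReuse(true)` on the container.

```java
import java.util.Map;

/** Hypothetical model of the "leave running" switch; not the PR's actual classes. */
final class LeaveRunningSketch {

    // Environment variable named in the PR description.
    static final String LEAVE_RUNNING_ENV = "PULSAR_CONTAINERS_LEAVE_RUNNING";

    /** Returns true only when the variable is set to "true" (case-insensitive). */
    static boolean shouldLeaveRunning(Map<String, String> env) {
        // Boolean.parseBoolean(null) is false, so an unset variable disables the mode.
        return Boolean.parseBoolean(env.get(LEAVE_RUNNING_ENV));
    }

    /** Stand-in for a test container whose stop() can be skipped (assumption). */
    static final class FakeContainer {
        private final boolean leaveRunning;
        private boolean running = true;

        FakeContainer(Map<String, String> env) {
            this.leaveRunning = shouldLeaveRunning(env);
        }

        void stop() {
            if (leaveRunning) {
                return; // debugging mode: leave the container up for inspection
            }
            running = false;
        }

        boolean isRunning() {
            return running;
        }
    }

    public static void main(String[] args) {
        FakeContainer container = new FakeContainer(System.getenv());
        container.stop();
        System.out.println("running after stop(): " + container.isRunning());
    }
}
```

Reading the variable once at construction keeps the decision stable for the lifetime of the container object, which is one possible design choice among several.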

Usage example

In Unix shells, one can pass environment variables by prefixing the command with the variable assignments, for example:

PULSAR_CONTAINERS_LEAVE_RUNNING=true TESTCONTAINERS_REUSE_ENABLE=true mvn -B -f tests/pom.xml test -DintegrationTests -DredirectTestOutputToFile=false -DtestRetryCount=0 -DfailIfNoTests=false -Dtest=CLITest#testCreateSubscriptionCommand

After the test run, one can use docker ps and docker exec -it [container_name] bash to get a shell in a container that was left running because the feature introduced by this PR was active.

After debugging, one can use this command to kill all containers that were left running:

docker kill $(docker ps -q --filter "label=pulsarcontainer=true")

(the solution in this PR labels the containers with "pulsarcontainer=true")

eolivelli

It will save a lot of time while working on integration tests!

LGTM

zymap

LGTM.

It would be wonderful if we can add the description into the test README.md, that would be helpful for others who want to debug the tests.

lhotari · 2021-02-19T10:28:38Z

> It would be wonderful if we can add the description into the test README.md, that would be helpful for others who want to debug the tests.

@zymap Makes sense. I was thinking of updating the README later. I have plans to add features for enabling attaching a debugger to a container as well as enabling the use of JConsole / Java Mission Control / Java Flight Recorder by enabling JMX for the JVMs in a configurable way. The README could be updated at that time to also cover the feature of this current PR.

lhotari · 2021-02-19T13:50:02Z

/pulsarbot run-failure-checks

bsideup · 2021-02-19T15:09:23Z

@lhotari TBH I would advise against this approach. There is no guarantee that the JVM hook will be using stop() (we thought about terminating multiple containers by the session label filter). Not to mention that Ryuk generally should not be disabled, unless your CI does not support it (the reason why the flag was added in the first place).

If you need the container to remain running, consider using the reusable containers mode.

lhotari · 2021-02-19T15:22:56Z

@bsideup Thanks for your comments. In this case, the changes in this PR aren't meant to be used in CI at all. The reason to leave the container running is to debug an issue locally. I've had the impression that it's the reason why TESTCONTAINERS_RYUK_DISABLED=true exists. At least, that's how I've been using it in the past.

Would you also recommend using the reusable container mode for the debugging use case that I have described? I guess that would also require enabling testcontainers.reuse.enable in ~/.testcontainers.properties?

I have one concern. The problem with reusable container mode is that it impacts the execution when a container gets reused. I'd simply like to leave the containers running after the test and the test JVM complete so that I could open a shell to the container and inspect the state. Perhaps this would be a feature request for Testcontainers?
Is there a workaround with reusable containers, such as adding a random label to the container, so that the container would never get reused but just left behind?

bsideup · 2021-02-19T15:52:14Z

@lhotari

> I've had the impression that it's the reason why TESTCONTAINERS_RYUK_DISABLED=true exists

We added the flag because BitBucket did not support mounting Docker socket a few years ago, but, since the builds were ephemeral, Ryuk could be omitted.

> Is there a workaround with reusable containers by adding a random label to the container etc so that the container would never get reused but just left behind?

Yes. The reusable feature works by hashing the container's definition. If you make the hash unique (e.g. random label / env variable / network alias / network id) then a new container will be started each execution and old containers won't be terminated (one can see it as a disadvantage but in your case this is exactly what you need :D)
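To make the idea concrete, here is a small, dependency-free model of "hash the definition, reuse on a match". It is only a sketch under assumptions: Testcontainers' real hashing is internal and covers more than image and labels, and `ReuseHashSketch`, `definitionHash`, `withRandomLabel`, the `debug-run-id` label and the image name are all hypothetical. In Testcontainers itself, the equivalent step would be adding a random value through `withLabel(...)` on a container configured with `withReuse(true)`.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;

/** Hypothetical model of definition hashing; not Testcontainers' implementation. */
final class ReuseHashSketch {

    /** Hashes a simplified container definition (image plus labels) to a hex string. */
    static String definitionHash(String image, Map<String, String> labels) {
        // Sort labels so that the hash does not depend on map iteration order.
        StringBuilder canonical = new StringBuilder(image);
        for (Map.Entry<String, String> e : new TreeMap<>(labels).entrySet()) {
            canonical.append('\n').append(e.getKey()).append('=').append(e.getValue());
        }
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] bytes = digest.digest(canonical.toString().getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : bytes) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
    }

    /** Returns a copy of the labels with a random value that makes the definition unique. */
    static Map<String, String> withRandomLabel(Map<String, String> labels) {
        Map<String, String> copy = new TreeMap<>(labels);
        copy.put("debug-run-id", UUID.randomUUID().toString()); // hypothetical label name
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> labels = Map.of("pulsarcontainer", "true");
        String image = "example/image:latest"; // placeholder image name

        // Identical definitions hash identically, so a running container would be reused.
        System.out.println(definitionHash(image, labels).equals(definitionHash(image, labels)));

        // A random label changes the hash, so each run starts a fresh container
        // and earlier containers are simply left behind.
        String first = definitionHash(image, withRandomLabel(labels));
        String second = definitionHash(image, withRandomLabel(labels));
        System.out.println(first.equals(second));
    }
}
```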

lhotari · 2021-02-19T16:04:33Z

Thanks for the advice @bsideup . I'll revisit this PR so that it doesn't depend on TESTCONTAINERS_RYUK_DISABLED=true and uses the reuse containers feature as part of the solution. That's cleaner.

…g after test completes - For debugging purposes, it is useful to have the ability to leave containers running. This mode can be activated by setting environment variables PULSAR_CONTAINERS_LEAVE_RUNNING=true and TESTCONTAINERS_REUSE_ENABLE=true - in this case, use this command afterwards to kill containers that were left running: docker kill $(docker ps -q --filter "label=pulsarcontainer=true")

lhotari · 2021-02-22T10:03:56Z

@bsideup I have revisited the solution in this PR to use the Testcontainers reuse mode instead of depending on the usage of TESTCONTAINERS_RYUK_DISABLED=true. Would you mind reviewing the changes?

lhotari · 2021-02-22T13:20:59Z

/pulsarbot run-failure-checks

bsideup · 2021-02-22T13:38:10Z

@lhotari it is okay :) I had a feeling that the stop() override is unnecessary but you know your usage better 👍

lhotari · 2021-02-22T14:54:35Z

/pulsarbot run-failure-checks

eolivelli approved these changes Feb 19, 2021


zymap reviewed Feb 19, 2021


zymap requested review from sijie, aahmed-se and codelipenghui February 19, 2021 09:10

zymap assigned lhotari Feb 19, 2021

zymap added the area/test label Feb 19, 2021

zymap added this to the 2.8.0 milestone Feb 19, 2021

lhotari marked this pull request as draft February 19, 2021 16:04

lhotari force-pushed the lh-add-testcontainers-debugging-feature branch from a0fd5d8 to f24d888 Compare February 22, 2021 09:50

lhotari mentioned this pull request Feb 23, 2021

[Testing] Improve integration test base classes #9672

Merged

lhotari marked this pull request as ready for review February 23, 2021 11:53

zymap approved these changes Feb 23, 2021


sijie approved these changes Feb 23, 2021


sijie merged commit 458bd91 into apache:master Feb 23, 2021
