Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable ci-integration tests on GHA (incl fixing a few tests) #5202

Merged
merged 13 commits into from
Mar 23, 2021

Conversation

cognifloyd
Copy link
Member

Move integration tests to GitHub Actions (GHA)

We also have to fix a few tests to make the transition.

List of commits:

  • improve CI output
  • improve GHA folding
  • fix virtualenv lock files issue again
  • Switch tests to check w/ CA that is present in GHA
  • apt download cache
  • use bitnami/rabbitmq service container in GHA
  • bust python cache
  • Use virtualenv from st2 virtualenv by default
  • Installing virtualenv in --user on github
  • Add workaround for failing stdin test
  • fix check-dependency-conflicts by updating pip-tools

cognifloyd and others added 11 commits March 22, 2021 19:22
- remove git clone progress reporting (reduce CI noise)
- report current user
- no random ports
- use separate rabbitmq.conf file on github
- ci: temporarily disable all but ci-integration
- ci: cp ssl_certs into rabbitmq container
- reenable management plugin
- manual custom.conf
- add python-version for ci-integration tests
- add default rabbitmq user/pass
- do not hardcode RMQ user/pass in tests
- override more rabbitmq test urls
- fix integration test conf files on the fly
- go back to guest:guest with rabbitmq
- finish reverting rmq guest:guest stuff
- Reconfigure RabbitMQ for ci-unit tests as well
If a virtualenv was built with symlinks instead of copies of the python
binary, as happens by default on GHA integration tests, then we end up
using the system virtualenv binary instead of what we installed.

Currently GHA has the same version of virtualenv, but we build
virtualenvs with --system-site-packages, and we need it to use the st2
virtualenv as the "system" dir instead of the GHA provided one. So,
don't get the realpath of the python binary.

Also, fall back to system virtualenv bin for backwards compat.
- install virtualenv in --user
- pip uninstall --user
- Try to recreate failure on Travis
- leave TODO/FIXME around broken test
- Use close_fds=True.
- add workaround for failing test
When we updated pip, we missed updating pip-tools to support the newer
pip version, so there were errors (that didn't fail the build, oops)
like these (fixed in pip-tools 5.4, pip-tools-PR-1216:

virtualenv/bin/pip-compile req.txt

Traceback (most recent call last):
  File "virtualenv/bin/pip-compile", line 8, in <module>
    sys.exit(cli())
  File "/home/runner/work/st2/st2/virtualenv/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/runner/work/st2/st2/virtualenv/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/runner/work/st2/st2/virtualenv/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/runner/work/st2/st2/virtualenv/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/runner/work/st2/st2/virtualenv/lib/python3.6/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/runner/work/st2/st2/virtualenv/lib/python3.6/site-packages/piptools/scripts/compile.py", line 458, in cli
    results = resolver.resolve(max_rounds=max_rounds)
  File "/home/runner/work/st2/st2/virtualenv/lib/python3.6/site-packages/piptools/resolver.py", line 173, in resolve
    has_changed, best_matches = self._resolve_one_round()
  File "/home/runner/work/st2/st2/virtualenv/lib/python3.6/site-packages/piptools/resolver.py", line 278, in _resolve_one_round
    their_constraints.extend(self._iter_dependencies(best_match))
  File "/home/runner/work/st2/st2/virtualenv/lib/python3.6/site-packages/piptools/resolver.py", line 388, in _iter_dependencies
    dependencies = self.repository.get_dependencies(ireq)
  File "/home/runner/work/st2/st2/virtualenv/lib/python3.6/site-packages/piptools/repositories/local.py", line 75, in get_dependencies
    return self.repository.get_dependencies(ireq)
  File "/home/runner/work/st2/st2/virtualenv/lib/python3.6/site-packages/piptools/repositories/pypi.py", line 232, in get_dependencies
    download_dir, ireq, wheel_cache
  File "/home/runner/work/st2/st2/virtualenv/lib/python3.6/site-packages/piptools/repositories/pypi.py", line 163, in resolve_reqs
    wheel_download_dir=self._wheel_download_dir,
TypeError: make_requirement_preparer() got an unexpected keyword argument 'wheel_download_dir'
@cognifloyd cognifloyd requested a review from blag March 23, 2021 04:40
@pull-request-size pull-request-size bot added the size/L PR that changes 100-499 lines. Requires some effort to review. label Mar 23, 2021
@cognifloyd cognifloyd added this to the 3.5.0 milestone Mar 23, 2021
@cognifloyd cognifloyd requested review from nmaludy and Kami March 23, 2021 04:41
@cognifloyd
Copy link
Member Author

Historical note:
I built this in st2sandbox#1 then squashed/reordered the 100+ commits into a few discreet commits in st2sandbox#2. In squashing/reordering commits, I removed most of the history of places I added various forms of debug logging.

Once this is merged, a maintainer should disable StackStorm/st2 on travis-ci.com. A future PR can clean up the repo, removing travis config, and getting rid of anything else that is travis-specific.

@cognifloyd
Copy link
Member Author

failure in integration tests is one of the orquesta tests that have been flaky lately. Please restart GitHub Actions workflows.

We still need the symlink in ~/.local/bin after restoring the cached
~/virtualenv directory. Also, be more verbose by printing the commands.
@cognifloyd cognifloyd force-pushed the ci-integration-to-gha branch from 5732597 to 3193858 Compare March 23, 2021 07:00
@cognifloyd
Copy link
Member Author

Fixes a bug with the cache. All tests pass now.

sudo service rabbitmq-server restart
sleep 5
sudo rabbitmq-plugins enable rabbitmq_management
sudo cp scripts/github/rabbitmq.conf /home/runner/rabbitmq_conf/custom.conf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future I think it would be better to move that code in scripts/ci/setup-rabbitmq.sh or similar so we can also run shell check, etc on those files.

git clone https://github.com/StackStorm/st2tests.git
echo Cloning https://github.com/StackStorm/st2tests.git
# -q = no progress reporting (better for CI). Errors will still print.
git clone -q https://github.com/StackStorm/st2tests.git
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we could even use --depth 1 // swallow clone to speed things up, but then again that repo is not that bug.

@@ -16,6 +16,9 @@ st2 --version
# Clean up old screen log files
rm -f logs/screen-*.log

# ::group::/::endgroup:: is helpful github actions syntax to fold this section.
echo ::group::launchdev.sh start -x
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per comment above since that script is now specific to gha, we should probably rename the directory and update the affected files :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not very difficult to make this work across multiple CI providers (like I did in ansible/molecule#2976 ), but I didn't see the point since this will only be running in GHA for now.

As I don't want to disable Travis until after this PR is merged, I minimized my changes to the travis scripts. Once we disable travis, though, there's a lot of code on the chopping block.

@Kami
Copy link
Member

Kami commented Mar 23, 2021

Thanks for doing this, it's great to how all the tests now run on GHA.

As far as orquesta test failures go - yeah, it appears that some orquesta integration tests which were failing quite often on Travis are also failing quite often on GHA.

I already assumed before it's some kind of race condition and now it looks even more likely.

@m4dcoder Do you think setting up a coordination backend for integration tests would make a difference with those race conditions / occasional failures? Or it's likely some other root cause we should look at improving in tests / similar?

Here are some examples:

@arm4b arm4b added the feature label Mar 23, 2021
@arm4b
Copy link
Member

arm4b commented Mar 23, 2021

Wohoo! Congrats @cognifloyd making that work, great achievement here that helps everyone a lot! 🤘

Copy link
Member

@arm4b arm4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️ integration tests for the GH Actions is the best indicator for me 👍
Great work!

Quick notes:
ci.yaml will need a bit of cleanup from the irrelevant comments and notes. Can we also remove Travis.yml?

@cognifloyd
Copy link
Member Author

cognifloyd commented Mar 23, 2021

Agreed, we need to do quite a bit of cleanup to get rid of all the remnants of travis.

The scope of this PR is to do the bare minimum to get it working in GHA. I do not want to disable Travis in this PR to demonstrate that it works in both CI systems. Once merged, a Maintainer can go into the dashboard on travis-ci.com to disable running Travis on this repo.

Also, I tried to document in ci.yaml why I made various decisions so that when we do cleanup that file, we don't inadvertently break things by going back to something that didn't work. That said, there is a lot of room for improvement with the ci files. I definitely want to see better organization with some scripts moving out of the workflow definition and others just going away as we make them obsolete.

Please, let's save a lot of this cleanup for future PRs.

# This could indicate that parent process (e.g. process which runs the tests has
# incorrectly opened the stdin and that one is then inherited by the process which is
# spawning it which will cause issues)
raise ValueError("Received no valid parameters data from sys.stdin")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker, but noting this in the future: it would be a good idea to log what already has been read into stdin_data.

Suggested change
raise ValueError("Received no valid parameters data from sys.stdin")
raise ValueError(f"Received no valid parameters data from sys.stdin:\n{stdin_data}")

@blag blag merged commit 007beed into StackStorm:master Mar 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature infrastructure: ci/cd size/L PR that changes 100-499 lines. Requires some effort to review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants