Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ascola/add parallel testing #5852

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

AustinScola
Copy link
Contributor

@AustinScola AustinScola commented Jul 3, 2021

What do these changes do?

  • Add test dependency on pytest-xdist
  • Re-compile the dev dependencies with pip-compile
    • (NOTE: Someone should double check this because I'm not 100% sure I did it correctly? I was using Python 3.7 and there were some changes I wouldn't have expected that got introduced into the dev.txt file.)
  • Add a parallel testing command make xtest which uses pytest-xdist
  • Update the contributing documentation to mention the make xtest command

Are there changes in behavior for the user?

When developing aiohttp, make xtest can now be used to run tests in parallel. This should be significantly faster than running the tests sequentially.

Related issue number

Resolves #5849

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
    • The format is <Name> <Surname>.
    • Please keep alphabetical order, the file is sorted by names.
  • Add a new news fragment into the CHANGES folder
    • name it <issue_id>.<type> for example (588.bugfix)
    • if you don't have an issue_id change it to the pr id after creating the pr
    • ensure type is one of the following:
      • .feature: Signifying a new feature.
      • .bugfix: Signifying a bug fix.
      • .doc: Signifying a documentation improvement.
      • .removal: Signifying a deprecation or removal of public API.
      • .misc: A ticket has been closed, but it is not of interest to users.
    • Make sure to use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files."

Add a testing dependency on `pytest-xdist` for running pytests in
parallel. I'm not sure if I ran the `pip-compile` command correctly here
because there seem to be more changes than I would expect? In particular
I ran it with Python3.7 but maybe I should have run it with Python3.8?
Add an `xtest` Makefile target for running pytests in parallel. The
`n -auto` flag tells `pytest-xdist` to use as many processes as the CPU
cores that the machine has.
Update the contributing documentation to include the parallel testing
command.
@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Jul 3, 2021
@codecov
Copy link

codecov bot commented Jul 4, 2021

Codecov Report

Merging #5852 (b2a3977) into master (ae23fd6) will increase coverage by 0.03%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5852      +/-   ##
==========================================
+ Coverage   96.75%   96.78%   +0.03%     
==========================================
  Files          44       44              
  Lines        9851     9851              
  Branches     1591     1591              
==========================================
+ Hits         9531     9534       +3     
+ Misses        182      179       -3     
  Partials      138      138              
Flag Coverage Δ
unit 96.68% <ø> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
aiohttp/worker.py 97.45% <0.00%> (+2.54%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ae23fd6...b2a3977. Read the comment docs.

@@ -141,6 +141,12 @@ Please take a look on the produced output.

Any extra texts (print statements and so on) should be removed.

Tests can also be run in parallel with:

.. code-block:: shell
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should actually be something like console. shell is for shell-scripts but console recognizes prompts (like $ or >) with commands and their outputs. See the pygments lexers lists for more details.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also mention that parallel run is experimental in the documentation unless it become stable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Though per my observations, locally it's mostly stable (unless you have a million chrome tabs plus something heavy running in the background like I do) but in CI, weird random failures will be happening more often.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed the lexer spec though it now is inconsistent will all the other lexers in the file (not sure if we want to fix this here or in another PR?) Also added a note about it being experimental.

Copy link
Member

@webknjaz webknjaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I attempted implementing this in the past, back when Travis CI was a thing. And there were flaky problems with certain envs so we reverted this in the end. Earlier this year, I tried again. I managed to address many problems but not all. See #5431 for that last try.

TL;DR The I/O loop must always run in the main thread in order for the signal handling to work. With pytest-xdist it's not always the case because sometimes the underlying lib execnet occupies that main thread with something else in an inconsistent manner. It now happens very rarely and is hard to reproduce — on my laptop, I had to make like 12-15 terminal tabs and invoke parallelized pytest in all of them (almost) simultaneously. Please follow the pointers in #5431 for more details.

In my last draft (#5431), I've tried auto-enabling pytest-xdist automatically under Python3.8+ with a new watcher that is capable of handling signals in threads. It is now on hold as I don't have time to continue exploring the problem at the moment but feel free to cherry-pick my commits into your branch (and maybe rebase on top) and try to check how well that'd work by overloading your system with many parallel processes and threads (that underlying race condition is most likely happen in such an env). Also, make sure to transfer any explanations/pointers from my PR so that all the info is consolidated here.

Lastly, I wanted to mention that I like the approach of having a separate make target. Ideally, though, the parallelized test run should be made default at some point and used in CI where any speed-ups are very welcome as well.

@webknjaz
Copy link
Member

webknjaz commented Jul 7, 2021

FTR just merged PR #5862 should've made xdist runs more stable on Python 3.8+ thanks to @sweatybridge. So I encourage everybody involved to test how it performs under heavy load + rebase this PR.

@AustinScola
Copy link
Contributor Author

@webknjaz Thanks for the background information. I was unaware of the previous effort. After the merge I haven't seen any signs of flakiness?

I also agree that it would be nice for this to be the default behavior (eventually).

@sweatybridge
Copy link
Contributor

@webknjaz I've drafted a proposal to add a --aiohttp-skip-watcher flag so that users "don't pay for what they don't use". Afaik, there's only 1 test in aiohttp that requires child watcher, ie. test_loop.py::test_subprocess_co. So I think it's fair to run this test separately in its own loop, on the main thread, and with SafeChildWatcher attached. The rest majority of users, including aiohttp tests, can switch to pytest-xdist without incurring this cost. For eg,

.PHONY: xtest
xtest: .develop
	@pytest -n auto -q --aiohttp-skip-watcher
	@pytest -q tests/test_loop.py::test_subprocess_co

Linking the PR here for comments.

@sweatybridge
Copy link
Contributor

test setup failed
loop_factory = <class 'asyncio.unix_events._UnixDefaultEventLoopPolicy'>
fast = False, loop_debug = False
@pytest.fixture
def loop(loop_factory, fast, loop_debug):
"""
Custom loop object that's safe to initialise from non-main thread.
Same implementation as aiohttp.pytest_plugin.loop but monkey patched to use
MultiLoopChildWatcher instead.
"""
policy = loop_factory()
asyncio.set_event_loop_policy(policy)
with patch("asyncio.SafeChildWatcher", new=asyncio.MultiLoopChildWatcher):
>           with loop_context(fast=fast) as _loop:
fast       = False
loop_debug = False
loop_factory = <class 'asyncio.unix_events._UnixDefaultEventLoopPolicy'>
policy     = <asyncio.unix_events._UnixDefaultEventLoopPolicy object at 0x7fce4d6c30d0>
unit_tests/conftest.py:290:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/local/lib/python3.8/contextlib.py:113: in __enter__
return next(self.gen)
self       = <contextlib._GeneratorContextManager object at 0x7fce4d6c3250>
/usr/local/lib/python3.8/site-packages/aiohttp/test_utils.py:493: in loop_context
loop = setup_test_loop(loop_factory)
fast       = False
loop_factory = <function new_event_loop at 0x7fce84a3a5e0>
/usr/local/lib/python3.8/site-packages/aiohttp/test_utils.py:518: in setup_test_loop
watcher.attach_loop(loop)
loop       = <_UnixSelectorEventLoop running=False closed=False debug=False>
loop_factory = <function new_event_loop at 0x7fce84a3a5e0>
module     = 'asyncio.unix_events'
policy     = <asyncio.unix_events._UnixDefaultEventLoopPolicy object at 0x7fce4d6c30d0>
skip_watcher = False
watcher    = <asyncio.unix_events.MultiLoopChildWatcher object at 0x7fce4d6c33d0>
/usr/local/lib/python3.8/asyncio/unix_events.py:1193: in attach_loop
self._saved_sighandler = signal.signal(signal.SIGCHLD, self._sig_chld)
loop       = <_UnixSelectorEventLoop running=False closed=False debug=False>
self       = <asyncio.unix_events.MultiLoopChildWatcher object at 0x7fce4d6c33d0>
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
signalnum = <Signals.SIGCHLD: 17>
handler = <bound method MultiLoopChildWatcher._sig_chld of <asyncio.unix_events.MultiLoopChildWatcher object at 0x7fce4d6c33d0>>
@_wraps(_signal.signal)
def signal(signalnum, handler):
>       handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))
E       ValueError: signal only works in main thread
handler    = <bound method MultiLoopChildWatcher._sig_chld of <asyncio.unix_events.MultiLoopChildWatcher object at 0x7fce4d6c33d0>>
signalnum  = <Signals.SIGCHLD: 17>
/usr/local/lib/python3.8/signal.py:47: ValueError

I tested MultiLoopChildWatcher in our CI and found the same issue with signals not working on main thread in 1 out of 20 runs. We have about 900 unit tests distributed on 8 cores.

I will change to ThreadedChildWatcher and test again.

@AustinScola
Copy link
Contributor Author

@sweatybridge does this PR require #5877 to be merged first?

@webknjaz
Copy link
Member

FTR it looks like this PR is going to end up being covered by #5431. There's been updates to pytest-xdist and execnet it uses that broke things yesterday. But once those are polished, it seems like we might have a way forward finally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bot:chronographer:provided There is a change note present in this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use pytest-xdist?
4 participants