
Optimize tests following #167 slowdown #198

Merged 30 commits into pangeo-forge:master on Sep 3, 2021

Conversation

@cisaacstern (Member) commented Sep 2, 2021

To start, I've just added a module, parametrization_counter.py, which reports diagnostic information on test runs.

More to follow. Closes #199 when complete.

@cisaacstern (Member, Author) commented:

#199 (comment) documents speed-ups in test_storage.py::test_file_opener after limiting input fixtures to sequential file paths only. Picking up that thread here with some insights on test_recipes.py::test_recipe_caching_copying following the same change.

From our current master

$ pytest test_recipes.py -k test_recipe_caching_copying
Results (252.91s):
     320 passed
     835 deselected

And from 076ed16

$ pytest test_recipes.py -k test_recipe_caching_copying
Results (103.09s):
     160 passed
     835 deselected

@cisaacstern marked this pull request as ready for review September 3, 2021 02:01
@cisaacstern (Member, Author) commented Sep 3, 2021

@rabernat, this PR shaves 15 mins off the 3.8 build and 10 mins off the 3.9 build compared to master (where each is ~30 min):

[Screenshot: CI build durations after this change, 2021-09-02]

We get there by refactoring the fixtures so we can use only sequential files/paths/patterns for the tests that move files around, thereby cutting out a few hundred unnecessary tests. Here are the numbers of test calls in each module, by PR:

module             #174   #167   #198
test_chunk_grid       6      8      8
test_fixtures         6     16     22
test_locking          6      6      6
test_patterns         6     22     22
test_recipes        803   1155    995
test_references       8      8      8
test_storage         99    387    195
test_utils            2      2      2

And a visual look at test calls from the current PR (xref plot of test calls for master in #199 (comment)):

[Plot: test calls per test for the current PR (optimize-tests)]

test_recipes.py::test_chunks is still the major outlier, but there doesn't seem to be a way to reduce those calls by changing the recipe fixture, since it only uses local paths as it is, and it doesn't seem appropriate to deprive that test of multivariable files.

This seems to be good enough progress for now, without making this into a multi-day exercise, so I'm good to merge if you are. Also happy to dig a little deeper, though, if you see any other obvious places for improvement.

If we merge, do you want to commit parametrization_counter.py, which I used to make those plots? I'd probably use it if it was there, but if not I'll drop it in a gist. (It wasn't clear to me if any pytest or third-party options already existed for making visual diagnostics like that.)
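For reference, a counter along these lines can be built on pytest's collection hooks; a minimal sketch (not the actual parametrization_counter.py, which may differ) could be:

# conftest.py: minimal sketch of counting parametrized test calls per module
from collections import Counter

def pytest_collection_modifyitems(config, items):
    # each collected item is one parametrized call; group them by test module
    counts = Counter(item.nodeid.split("::")[0] for item in items)
    for module, n_calls in sorted(counts.items()):
        print(f"{module}: {n_calls} calls")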

@cisaacstern (Member, Author) commented:

On further reflection, I thought of a way to possibly cut the file transfer testing time in half again. I'll push that update later tonight or first thing tomorrow.

@cisaacstern (Member, Author) commented:

We're now basically back to where we were before #167 in terms of testing time and IMHO this is now ready to merge:

[Screenshot: CI build durations, 2021-09-03]

Previously, when I mentioned testing only the "sequential" files, both the "D" and "2D" interval files were included in that. Considering each http run requires three auth parameterizations (no auth, basic auth, query string auth), even that little bit of redundancy was adding up.

In the last few commits, I refactored the fixtures again to make this items_per_file option explicit in the local paths, thereby exposing just the "D" interval files to build our http path/pattern/recipe fixtures. I'm now convinced that we don't need to test any other files over http, because the purpose of the http tests is really to ensure that file opening/caching/authentication/etc. works, and we don't actually care about the content of the files in those tests (only that they get where they're going intact).
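A rough sketch of that kind of narrowing (the fixture and helper names below are illustrative assumptions, not the repo's actual fixtures):

# conftest.py: illustrative sketch only
import pytest

@pytest.fixture(params=[1, 2], ids=["D", "2D"])
def netcdf_local_paths(request, tmp_path):
    # local tests still cover both intervals; make_local_netcdf_paths is a hypothetical helper
    return make_local_netcdf_paths(tmp_path, items_per_file=request.param)

@pytest.fixture
def netcdf_local_paths_1d(tmp_path):
    # only the daily ("D") files feed the http fixtures
    return make_local_netcdf_paths(tmp_path, items_per_file=1)

@pytest.fixture(params=["no_auth", "basic_auth", "query_string_auth"])
def netcdf_http_paths(request, netcdf_local_paths_1d):
    # the three auth variants now multiply over one file set instead of two
    return serve_over_http(netcdf_local_paths_1d, auth=request.param)  # hypothetical helper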

Similarly, I did change test_storage.py::test_file_opener so that it only runs with the local sequential files. This is a marginal speed increase over including all of the local files, but I thought I'd go for it while I was at it. Again, IIUC, for file opening/transfer/etc. tests, diversity of internal contents in the files being opened and transferred shouldn't matter. (At least, for files of roughly the same size and format.)

Regarding my earlier question, "do you want to commit parametrization_counter.py": I moved this out of the PR to https://github.com/cisaacstern/pytest-param-counter. And just because I like that plot, here's where we stand:

[Plot: test calls per test after reducing the http tests (optimize-chunks-fewer-http-tests)]

@cisaacstern (Member, Author) commented Sep 3, 2021

Noting that this also turns on duration profiling for the GitHub Actions pytest runs (per Ryan's suggestion), and that all 10 of the slowest tests are now some version of tests/test_recipes.py::test_chunks[prefect..., which was not the case prior to the work in this PR, as documented in #199 (comment). Working on this is out of scope for this PR, IMO, but it's good to keep in mind for further optimization down the line:

============================= slowest 10 durations =============================
3.30s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks6-False-error_expectation6-1-subset_inputs1-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
3.17s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks5-False-error_expectation5-1-subset_inputs1-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
3.17s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks6-False-error_expectation6-1-subset_inputs0-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
3.14s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks5-False-error_expectation5-1-subset_inputs0-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
2.97s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks2-True-error_expectation2-1-subset_inputs1-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
2.94s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks1-True-error_expectation1-1-subset_inputs0-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
2.90s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks1-True-error_expectation1-1-subset_inputs1-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
2.90s call     tests/test_recipes.py::test_chunks[prefect-dask-netCDFtoZarr_recipe-target_chunks5-False-error_expectation5-1-subset_inputs1-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
2.88s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks0-True-error_expectation0-1-subset_inputs1-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
2.87s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks2-True-error_expectation2-1-subset_inputs0-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
========== 1041 passed, 70 skipped, 465 warnings in 696.19s (0:11:36) ==========
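(For reference, this report comes from pytest's built-in duration profiling; the same output can be requested locally with the --durations flag, e.g.:)

$ pytest tests/ --durations=10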

@cisaacstern (Member, Author) commented Sep 3, 2021

@rabernat, is there a reason why we'd want each one of these param combinations of test_chunks

@pytest.mark.parametrize("inputs_per_chunk,subset_inputs", [(1, {}), (1, {"time": 2}), (2, {})])
@pytest.mark.parametrize(
"target_chunks,specify_nitems_per_input,error_expectation",
[
({}, True, does_not_raise()),
({"lon": 12}, True, does_not_raise()),
({"lon": 12, "time": 1}, True, does_not_raise()),
({"lon": 12, "time": 3}, True, does_not_raise()),
({"time": 10}, True, does_not_raise()), # only one big chunk
({"lon": 12, "time": 1}, False, does_not_raise()),
({"lon": 12, "time": 3}, False, does_not_raise()),
# can't determine target chunks for the next two because 'time' missing from target_chunks
({}, False, pytest.raises(ValueError)),
({"lon": 12}, False, pytest.raises(ValueError)),
],
)
@pytest.mark.parametrize("recipe_fixture", recipes_no_subset)
def test_chunks(

to be run with each of the 5 executors

@pytest.fixture(params=["manual", "python", "dask", "prefect", "prefect-dask"])
def execute_recipe(request, dask_cluster):

I believe chunking/subsetting behavior should be independent of the executor? If we run test_chunks with just a single executor (manual, maybe, or python), I think we'd see dramatic speed-ups in the test suite.

@rabernat (Contributor) commented Sep 3, 2021

This test_chunks test actually does two important things:

  • It tests chunking options in XarrayZarrRecipe, verifying that the various options and combinations of parameters work as expected.
  • Since some of these options involve different store_chunk tasks writing to the same Zarr chunks (again, an ambiguous terminology collision 💥; see #182, "Disambiguate naming of store_chunk/iter_chunks from target_chunks"), it verifies that our distributed locking mechanisms work properly (and indeed, these tests write corrupted data when locking doesn't work, so that is very useful).

But it is an expensive test, and I agree that we should eliminate some parameterization. I recommend three steps (a rough sketch follows the list):

  • Refactor the execute_recipe fixture to be four separate fixtures (allowing us to apply executors more selectively in the tests, rather than always using all of them), and then make combinations of them using lazy_fixture
  • Make test_chunks only execute with one executor (probably python, the simplest)
  • Create a new test called test_chunks_distributed_locking that is explicitly aimed at verifying the locking. This doesn't need parameterization; just use the option set ({"lon": 12, "time": 3}, False, does_not_raise()). (This produces write conflicts because the task chunks [either size 1 or 2] are smaller than the Zarr chunks [3])
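A minimal sketch of what those three steps could look like (the fixture and helper names below are assumptions, not the actual refactor):

# conftest.py: sketch with assumed names
import pytest
from pytest_lazyfixture import lazy_fixture

@pytest.fixture
def execute_recipe_python():
    # runs a recipe in-process with the simple python executor (details omitted)
    ...

@pytest.fixture
def execute_recipe_dask(dask_cluster):
    # runs a recipe on the dask_cluster fixture (details omitted)
    ...

# tests that genuinely need every executor combine the individual fixtures via lazy_fixture
@pytest.fixture(
    params=[lazy_fixture("execute_recipe_python"), lazy_fixture("execute_recipe_dask")]
)
def execute_recipe(request):
    return request.param

# test_recipes.py: step 3, a locking-focused test with no parametrization,
# hard-coded to the one option set that forces write conflicts
def test_chunks_distributed_locking(netCDFtoZarr_recipe, execute_recipe_dask):
    target_chunks = {"lon": 12, "time": 3}  # Zarr chunks (3) larger than the task chunks (1 or 2)
    ...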

@cisaacstern (Member, Author) commented:

I believe all the steps in #198 (comment) are now complete.

We're now at the point where almost half the test time is spent setting up miniconda. Will look into https://github.com/marketplace/actions/setup-miniconda#caching now to reduce that.

@rabernat (Contributor) left a comment:

Awesome work Charles!

@rabernat merged commit c4a4ed8 into pangeo-forge:master Sep 3, 2021
@cisaacstern (Member, Author) commented:

To close the loop, here's where we landed with this PR, in terms of calls per test (plot below).

@rabernat,

def test_process(recipe_fixture, execute_recipe, process_input, process_chunk):
    """Check that the process_chunk and process_input arguments work as expected."""

is now the outlier. Just checked locally and if we run it with the new execute_recipe_python fixture only, we can save more than a minute, going from

Results (87.57s):
     120 passed

to

Results (10.40s):
      24 passed

any reason test_process needs to be run with all the executors?

[Plot: test calls per test for this PR (PR-198)]

@cisaacstern (Member, Author) commented:

and the same could be asked of

def test_prune_recipe(recipe_fixture, execute_recipe, nkeep):
    """Check that recipe.copy_pruned works as expected."""

@rabernat (Contributor) commented Sep 3, 2021

Very true. Our default approach for testing is basically to run every test with every recipe and every executor. This obviously creates a lot of tests. But the good thing is that we really verify that all features work with all the different executors. We could make the test suite a lot slimmer by removing all this executor parameterization...if we are sure we can trust the executors.

I would feel more comfortable doing that if we formalized the interface with the executors more. The pipelines model (#192) allows us to do that. If all of the recipes export pipelines, and we trust the executors to execute those pipelines faithfully, then we can remove this parameterization. This would require the pipelines framework to be very well tested. Ideally, we would put pipelines into its own package, which both pangeo-forge-recipes and rechunker would depend on.

We can do these things and they will help. But they are probably not your top priority right now.

@cisaacstern (Member, Author) commented:

Makes a lot of sense. Thanks for the detailed reflections. If dependency caching looks like low-hanging fruit after a bit more reading, I may throw that in today, given just how long the miniconda setup takes. But I'll leave the parameter tweaking here for now, as I fully agree the path forward is through #192, not fiddling with the existing setup.

Successfully merging this pull request may close these issues: Tests slowed by > 2x following merge of #167