
Optimize tests following #167 slowdown #198

Merged 30 commits into pangeo-forge:master on Sep 3, 2021

Conversation

@cisaacstern (Member) commented Sep 2, 2021

To start, I've just added a module, parametrization_counter.py, which reports diagnostic information on test runs.

More to follow. Closes #199 when complete.

@cisaacstern (Member, Author) commented:

#199 (comment) documents speed-ups in test_storage.py::test_file_opener after limiting input fixtures to sequential file paths only. Picking up that thread here with some insights on test_recipes.py::test_recipe_caching_copying following the same change.

From our current master

$ pytest test_recipes.py -k test_recipe_caching_copying
Results (252.91s):
     320 passed
     835 deselected

And from 076ed16

$ pytest test_recipes.py -k test_recipe_caching_copying
Results (103.09s):
     160 passed
     835 deselected

@cisaacstern marked this pull request as ready for review September 3, 2021 02:01
@cisaacstern (Member, Author) commented Sep 3, 2021

@rabernat, this PR shaves 15 mins off the 3.8 build and 10 mins off the 3.9 build compared to master (where each is ~30 min):

[Screenshot: CI build durations after this change, 2021-09-02]

We get there by refactoring the fixtures so we can use only sequential files/paths/patterns for the tests that move files around, thereby cutting out a few hundred unnecessary tests. Here are the numbers of test calls in each module, by PR:

module             #174   #167   #198
test_chunk_grid       6      8      8
test_fixtures         6     16     22
test_locking          6      6      6
test_patterns         6     22     22
test_recipes        803   1155    995
test_references       8      8      8
test_storage         99    387    195
test_utils            2      2      2

And a visual look at test calls from the current PR (xref plot of test calls for master in #199 (comment)):

[Plot: test calls per test for the current PR (optimize-tests)]

test_recipes.py::test_chunks is still the major outlier, but there doesn't seem to be a way to reduce those calls by changing the recipe fixture, since it only uses local paths as it is, and it doesn't seem appropriate to deprive that test of multivariable files.

This seems to be good enough progress for now, without making this into a multi-day exercise, so I'm good to merge if you are. Also happy to dig a little deeper, though, if you see any other obvious places for improvement.

If we merge, do you want to commit parametrization_counter.py, which I used to make those plots? I'd probably use it if it was there, but if not I'll drop it in a gist. (It wasn't clear to me if any pytest or third-party options already existed for making visual diagnostics like that.)
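For reference, a counter along these lines can be built on pytest's collection hooks; a minimal sketch (not the actual parametrization_counter.py, which may differ) could be:

# conftest.py: minimal sketch of counting parametrized test calls per module
from collections import Counter

def pytest_collection_modifyitems(config, items):
    # each collected item is one parametrized call; group them by test module
    counts = Counter(item.nodeid.split("::")[0] for item in items)
    for module, n_calls in sorted(counts.items()):
        print(f"{module}: {n_calls} calls")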

@cisaacstern (Member, Author) commented:

On further reflection, I thought of a way to possibly cut the file transfer testing time in half again. I'll push that update later tonight or first thing tomorrow.

@cisaacstern (Member, Author) commented:

We're now basically back to where we were before #167 in terms of testing time and IMHO this is now ready to merge:

[Screenshot: CI build durations, 2021-09-03]

Previously, when I mentioned testing only the "sequential" files, both the "D" and "2D" interval files were included in that. Considering each http run requires three auth parameterizations (no auth, basic auth, query string auth), even that little bit of redundancy was adding up.

In the last few commits, I refactored the fixtures again to make this items_per_file option explicit in the local paths, thereby exposing just the "D" interval files to build our http path/pattern/recipe fixtures. I'm now convinced that we don't need to test any other files over http, because the purpose of the http tests is really to ensure that file opening/caching/authentication/etc. works, and we don't actually care about the content of the files in those tests (only that they get where they're going intact).
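A rough sketch of that kind of narrowing (the fixture and helper names below are illustrative assumptions, not the repo's actual fixtures):

# conftest.py: illustrative sketch only
import pytest

@pytest.fixture(params=[1, 2], ids=["D", "2D"])
def netcdf_local_paths(request, tmp_path):
    # local tests still cover both intervals; make_local_netcdf_paths is a hypothetical helper
    return make_local_netcdf_paths(tmp_path, items_per_file=request.param)

@pytest.fixture
def netcdf_local_paths_1d(tmp_path):
    # only the daily ("D") files feed the http fixtures
    return make_local_netcdf_paths(tmp_path, items_per_file=1)

@pytest.fixture(params=["no_auth", "basic_auth", "query_string_auth"])
def netcdf_http_paths(request, netcdf_local_paths_1d):
    # the three auth variants now multiply over one file set instead of two
    return serve_over_http(netcdf_local_paths_1d, auth=request.param)  # hypothetical helper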

Similarly, I did change test_storage.py::test_file_opener so that it only runs with the local sequential files. This is a marginal speed increase over including all of the local files, but I thought I'd go for it while I was at it. Again, IIUC, for file opening/transfer/etc. tests, diversity of internal contents in the files being opened and transferred shouldn't matter. (At least, for files of roughly the same size and format.)

Regarding my earlier question, "do you want to commit parametrization_counter.py": I moved this out of the PR to https://github.com/cisaacstern/pytest-param-counter. And just because I like that plot, here's where we stand:

[Plot: test calls per test after reducing the http tests (optimize-chunks-fewer-http-tests)]

@cisaacstern (Member, Author) commented Sep 3, 2021

Noting that this also turns on duration profiling for the GitHub Actions pytest runs (per Ryan's suggestion), and that all 10 of the slowest tests are now some version of tests/test_recipes.py::test_chunks[prefect..., which was not the case prior to the work in this PR, as documented in #199 (comment). Working on this is out of scope for this PR, IMO, but it's good to keep in mind for further optimization down the line:

============================= slowest 10 durations =============================
3.30s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks6-False-error_expectation6-1-subset_inputs1-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
3.17s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks5-False-error_expectation5-1-subset_inputs1-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
3.17s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks6-False-error_expectation6-1-subset_inputs0-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
3.14s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks5-False-error_expectation5-1-subset_inputs0-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
2.97s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks2-True-error_expectation2-1-subset_inputs1-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
2.94s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks1-True-error_expectation1-1-subset_inputs0-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
2.90s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks1-True-error_expectation1-1-subset_inputs1-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
2.90s call     tests/test_recipes.py::test_chunks[prefect-dask-netCDFtoZarr_recipe-target_chunks5-False-error_expectation5-1-subset_inputs1-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
2.88s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks0-True-error_expectation0-1-subset_inputs1-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
2.87s call     tests/test_recipes.py::test_chunks[prefect-netCDFtoZarr_recipe-target_chunks2-True-error_expectation2-1-subset_inputs0-netcdf_local_file_pattern_sequential_multivariable-netcdf_local_paths_sequential_multivariable_1d]
========== 1041 passed, 70 skipped, 465 warnings in 696.19s (0:11:36) ==========
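(For reference, this report comes from pytest's built-in duration profiling; the same output can be requested locally with the --durations flag, e.g.:)

$ pytest tests/ --durations=10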

@cisaacstern (Member, Author) commented Sep 3, 2021

@rabernat, is there a reason why we'd want each one of these param combinations of test_chunks

@pytest.mark.parametrize("inputs_per_chunk,subset_inputs", [(1, {}), (1, {"time": 2}), (2, {})])
@pytest.mark.parametrize(
"target_chunks,specify_nitems_per_input,error_expectation",
[
({}, True, does_not_raise()),
({"lon": 12}, True, does_not_raise()),
({"lon": 12, "time": 1}, True, does_not_raise()),
({"lon": 12, "time": 3}, True, does_not_raise()),
({"time": 10}, True, does_not_raise()), # only one big chunk
({"lon": 12, "time": 1}, False, does_not_raise()),
({"lon": 12, "time": 3}, False, does_not_raise()),
# can't determine target chunks for the next two because 'time' missing from target_chunks
({}, False, pytest.raises(ValueError)),
({"lon": 12}, False, pytest.raises(ValueError)),
],
)
@pytest.mark.parametrize("recipe_fixture", recipes_no_subset)
def test_chunks(

to be run with each of the 5 executors

@pytest.fixture(params=["manual", "python", "dask", "prefect", "prefect-dask"])
def execute_recipe(request, dask_cluster):

I believe chunking/subsetting behavior should be independent of the executor? If we run test_chunks with just a single executor (manual, maybe, or python), I think we'd see dramatic speed-ups in the test suite.

@rabernat (Contributor) commented Sep 3, 2021

This test_chunks test actually does two important things:

  • It tests chunking options in XarrayZarrRecipe, verifying that the various options and combinations of parameters work as expected.
  • Since some of these options involve different store_chunk tasks writing to the same Zarr chunks (again, an ambiguous terminology collision 💥; see #182, "Disambiguate naming of store_chunk/iter_chunks from target_chunks"), it verifies that our distributed locking mechanisms work properly (and indeed, these tests write corrupted data when locking doesn't work, so that is very useful).

But it is an expensive test, and I agree that we should eliminate some parameterization. I recommend three steps (a rough sketch follows the list):

  • Refactor the execute_recipe fixture to be four separate fixtures (allowing us to apply executors more selectively in the tests, rather than always using all of them), and then make combinations of them using lazy_fixture
  • Make test_chunks only execute with one executor (probably python, the simplest)
  • Create a new test called test_chunks_distributed_locking that is explicitly aimed at verifying the locking. This doesn't need parameterization; just use the option set ({"lon": 12, "time": 3}, False, does_not_raise()). (This produces write conflicts because the task chunks [either size 1 or 2] are smaller than the Zarr chunks [3])
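A minimal sketch of what those three steps could look like (the fixture and helper names below are assumptions, not the actual refactor):

# conftest.py: sketch with assumed names
import pytest
from pytest_lazyfixture import lazy_fixture

@pytest.fixture
def execute_recipe_python():
    # runs a recipe in-process with the simple python executor (details omitted)
    ...

@pytest.fixture
def execute_recipe_dask(dask_cluster):
    # runs a recipe on the dask_cluster fixture (details omitted)
    ...

# tests that genuinely need every executor combine the individual fixtures via lazy_fixture
@pytest.fixture(
    params=[lazy_fixture("execute_recipe_python"), lazy_fixture("execute_recipe_dask")]
)
def execute_recipe(request):
    return request.param

# test_recipes.py: step 3, a locking-focused test with no parametrization,
# hard-coded to the one option set that forces write conflicts
def test_chunks_distributed_locking(netCDFtoZarr_recipe, execute_recipe_dask):
    target_chunks = {"lon": 12, "time": 3}  # Zarr chunks (3) larger than the task chunks (1 or 2)
    ...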

@cisaacstern (Member, Author) commented:

I believe all the steps in #198 (comment) are now complete.

We're now at the point where almost half the test time is spent setting up miniconda. Will look into https://github.com/marketplace/actions/setup-miniconda#caching now to reduce that.

@rabernat (Contributor) left a comment:

Awesome work Charles!

@rabernat merged commit c4a4ed8 into pangeo-forge:master Sep 3, 2021
@cisaacstern (Member, Author) commented:

To close the loop, here's where we landed with this PR, in terms of calls per test (plot below).

@rabernat,

def test_process(recipe_fixture, execute_recipe, process_input, process_chunk):
    """Check that the process_chunk and process_input arguments work as expected."""

is now the outlier. Just checked locally and if we run it with the new execute_recipe_python fixture only, we can save more than a minute, going from

Results (87.57s):
     120 passed

to

Results (10.40s):
      24 passed

any reason test_process needs to be run with all the executors?

[Plot: test calls per test for this PR (PR-198)]

@cisaacstern (Member, Author) commented:

and the same could be asked of

def test_prune_recipe(recipe_fixture, execute_recipe, nkeep):
    """Check that recipe.copy_pruned works as expected."""

@rabernat (Contributor) commented Sep 3, 2021

Very true. Our default approach for testing is basically to run every test with every recipe and every executor. This obviously creates a lot of tests. But the good thing is that we really verify that all features work with all the different executors. We could make the test suite a lot slimmer by removing all this executor parameterization...if we are sure we can trust the executors.

I would feel more comfortable doing that if we formalized the interface with the executors more. The pipelines model (#192) allows us to do that. If all of the recipes export pipelines, and we trust the executors to execute those pipelines faithfully, then we can remove this parameterization. This would require the pipelines framework to be very well tested. Ideally, we would put pipelines into its own package, which both pangeo-forge-recipes and rechunker would depend on.

We can do these things and they will help. But they are probably not your top priority right now.

@cisaacstern (Member, Author) commented:

Makes a lot of sense. Thanks for the detailed reflections. If dependency caching looks like low-hanging fruit after a bit more reading, I may throw that in today, given just how long the miniconda setup takes. But I'll leave the parameter tweaking here for now, as I fully agree the path forward is through #192, not fiddling with the existing setup.

Successfully merging this pull request may close these issues: Tests slowed by > 2x following merge of #167