Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TST] avoid evaluating data loader functions in parametrized tests at set-up time #206

Closed
2 tasks done
orbeckst opened this issue Jul 6, 2022 · 1 comment · Fixed by #278
Closed
2 tasks done
Labels

Comments

@orbeckst
Copy link
Member

orbeckst commented Jul 6, 2022

Problem

Our tests contain examples of parametrizing over different datasets by calling the data loader function in the pytest.mark.parametrize such as

@pytest.mark.parametrize(('data', 'size'), [(gmx_benzene_dHdl(), 661),
(gmx_benzene_u_nk(), 661)])
def test_basic_slicing(self, data, size):
assert len(self.slicer(data, lower=1000, upper=34000, step=5)) == size

The data loader functions gmx_benzene_dHdl() and gmx_benzene_u_nk() are called at the time when pytest collects tests. This slows down the test setup phase substantially (it's done in serial) and it also has the potential to fill up memory.

Solution

Make sure that the loader function is only evaluated inside the test.

Use pytest's getfixturevalue

Use request.getfixturevalue(fixture_name) to dynamically run the fixture function named fixture_name

Should probably look like

@pytest.mark.parametrize(('dataloader', 'size'), [('gmx_benzene_dHdl', 661), 
                                                  ('gmx_benzene_u_nk', 661)]) 
     def test_basic_slicing(self, dataloader, size, request):
         data = request.getfixturevalue(dataloader)
         assert len(self.slicer(data, lower=1000, upper=34000, step=5)) == size 

Note that in the example above, the dataloader is the name of a pytest fixture and not an ordinary function (as currently implemented in our tests).

Current hacky solution in alchemlyb

Instead, only pass the function objects to a parametrized fixture and then evaluate inside the parametrized fixture itself, as shown, for example in

@pytest.fixture(scope="class",
params=[(gmx_benzene_coul_u_nk, 3.041, 0.02088),
(gmx_benzene_vdw_u_nk, -3.007, 0.04519),
(gmx_expanded_ensemble_case_1, 75.923, 0.14124),
(gmx_expanded_ensemble_case_2, 75.915, 0.14372),
(gmx_expanded_ensemble_case_3, 76.173, 0.11345),
(gmx_water_particle_with_total_energy, -11.680, 0.083655),
(gmx_water_particle_with_potential_energy, -11.675, 0.083589),
(gmx_water_particle_without_energy, -11.654, 0.083415),
(amber_bace_example_complex_vdw, 2.40200, 0.0618453),
(gomc_benzene_u_nk, -0.79994, 0.091579),
])
def X_delta_f(self, request):
get_unk, E, dE = request.param
return get_unk(), E, dE
def test_mbar(self, X_delta_f):
self.compare_delta_f(X_delta_f)
This approach ensures that data is loaded when needed and can be done in parallel.

TODO

  • find all instances of up-front data loading
  • replace with code that uses getfixturevalue
@orbeckst
Copy link
Member Author

orbeckst commented Jul 7, 2022

Note (from #199 (comment) ) that the data-loading functions will have to be converted to fixtures and then won't be usable as functions anymore.

This was referenced Dec 4, 2022
orbeckst pushed a commit that referenced this issue Dec 6, 2022
#278)

- fix #206 
- speed up tests by avoiding to evaluate fixtures at test collection time
- reorganized fixtures: fixtures that load data are now global in `conftest.py`
- updated tests
- updated CHANGES
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant