Clean up the iloc in the tests #254

xiki-tempula · 2022-10-16T20:32:44Z

I have cleaned up most of the iloc calls in the tests, but a couple of tests still use iloc. Most of them are parameterised tests where different datasets need different column labels.

    @pytest.mark.parametrize(('data', 'size'), [(gmx_benzene_dHdl(), 4001),
                                                (gmx_benzene_u_nk(), 4001)])
    def test_subsampling(self, data, size):
        assert len(self.slicer(data, series=data.iloc[:, 0])) <= size

In this case the first column is always chosen, and its name differs between datasets.
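A minimal sketch of why positional selection is convenient here. The two frames and their labels are made up for illustration and are not the real gmx_benzene datasets; first_column is a hypothetical helper.

```python
import pandas as pd

# Toy stand-ins for two datasets whose first columns carry different labels.
# Labels and values are invented for this sketch.
dhdl_like = pd.DataFrame({"fep": [1.0, 2.0, 3.0]})
u_nk_like = pd.DataFrame({0.0: [0.1, 0.2, 0.3], 1.0: [0.4, 0.5, 0.6]})


def first_column(df):
    """Return the first column by position, whatever its label is."""
    return df.iloc[:, 0]


for frame in (dhdl_like, u_nk_like):
    series = first_column(frame)
    # The label differs between frames, the positional lookup does not.
    print(repr(series.name), len(series))
```

A label-based version would need the column name passed in as an extra test parameter for every dataset.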

Another case is

    def get_delta_f(self, est):
        ee = 0.0

        for i in range(len(est.d_delta_f_) - 1):
            ee += est.d_delta_f_.values[i][i+1]**2
        return est.delta_f_.iloc[0, -1], ee**0.5

Here, depending on the dataset, this could be est.delta_f_[0.0][1.0] or est.delta_f_.loc[(0.0,0.0)][(1.0,1.0)].
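A small sketch of the two label layouts, assuming the single-lambda case uses plain float labels and the multi-lambda case uses tuple labels held in a MultiIndex. The matrices are invented and first_to_last is a hypothetical helper, not part of alchemlyb.

```python
import numpy as np
import pandas as pd

# Invented 2x2 free energy difference matrix, used for both layouts.
values = np.array([[0.0, 1.5], [-1.5, 0.0]])

# One lambda: plain float labels on both axes.
flat = pd.DataFrame(values, index=[0.0, 1.0], columns=[0.0, 1.0])

# Two lambdas: tuple labels (assumed here to be stored in a MultiIndex).
labels = pd.MultiIndex.from_tuples([(0.0, 0.0), (1.0, 1.0)])
multi = pd.DataFrame(values, index=labels, columns=labels)


def first_to_last(delta_f):
    """Pick the first-state to last-state entry by position."""
    return delta_f.iloc[0, -1]


# The first row label has a different shape in each frame ...
print(flat.index[0], multi.index[0])
# ... but the positional lookup is the same call for both.
print(first_to_last(flat), first_to_last(multi))
```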

codecov · 2022-10-16T21:06:44Z

Codecov Report

Merging #254 (45c9e0a) into master (079a5b5) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #254   +/-   ##
=======================================
  Coverage   98.69%   98.69%           
=======================================
  Files          26       26           
  Lines        1761     1761           
  Branches      379      379           
=======================================
  Hits         1738     1738           
  Misses          3        3           
  Partials       20       20

orbeckst · 2022-10-24T20:43:48Z

I think the remaining iloc cases are fine. Could you add a short comment stating why these are ilocs, just so that what you found now is not forgotten?

orbeckst · 2022-12-05T16:12:45Z

Please resolve the conflicts and ping me when you need a review.

xiki-tempula · 2022-12-05T17:24:27Z

@orbeckst I have merged the master into this branch.

orbeckst

Thanks for the updates.

I have a bunch of suggestions & PEP8 fixes. Primarily, can we avoid df.loc[a][b] and write df.loc[a, b] instead?

I am approving but please do at least the PEP8 formatting. Thank you!
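For reference, a minimal sketch of the two spellings on a frame with a plain (non-hierarchical) index, where they return the same value. The frame is a made-up example.

```python
import pandas as pd

# Made-up frame with a plain, single-level index.
df = pd.DataFrame({"x": [1, 2], "y": [3, 4]}, index=["a", "b"])

# Chained form: first pull out row "a" as a Series, then index that Series.
chained = df.loc["a"]["y"]

# Single call: row and column labels are resolved in one .loc lookup.
single = df.loc["a", "y"]

print(chained, single)
```

On a plain index the two are interchangeable for reading; with hierarchical row labels the comma form can be read differently by pandas, which is what the review threads on this PR discuss.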

orbeckst · 2022-12-05T20:33:35Z

src/alchemlyb/tests/parsing/test_gmx.py

-                ds += u_nk.iloc[i][i]
+            for i, lambda_ in enumerate(u_nk.columns):
+                #18.6 is the time step
+                ds += u_nk.loc[i*186/10][lambda_].values[0]


If this is a proper .loc, can't we use the following?

Suggested change

-                ds += u_nk.loc[i*186/10][lambda_].values[0]
+                ds += u_nk.loc[i*186/10, lambda_].values[0]

This is not a proper .loc.
In theory, .loc[1.0] could match a dataframe whose row index is (1.0, 0.0, 0.0) as well as one whose row index is (1.0).
But .loc[1.0, :] would strictly match only (1.0).
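A minimal sketch of the partial matching on .loc with a single label, assuming a toy frame whose rows are indexed either by time alone or by (time, fep-lambda). The index names and values are made up.

```python
import pandas as pd

data = {0.0: [1.0, 2.0, 3.0], 1.0: [4.0, 5.0, 6.0]}
times = [0.0, 18.6, 37.2]

# Rows labelled by time only.
flat = pd.DataFrame(data, index=pd.Index(times, name="time"))

# Rows labelled by (time, fep-lambda).
multi = pd.DataFrame(
    data,
    index=pd.MultiIndex.from_arrays(
        [times, [0.0, 0.0, 0.0]], names=["time", "fep-lambda"]
    ),
)

# The same single label is accepted by both frames ...
row = flat.loc[18.6]     # exact match: one row, returned as a Series
block = multi.loc[18.6]  # partial match on the first level: a DataFrame

# ... but the result types differ, so the follow-up indexing differs too.
print(type(row).__name__, type(block).__name__)
print(row[1.0], block[1.0].values[0])
```

This is why the chained form in the test ends in .values[0]: after the partial match, selecting a column still yields a one-element Series rather than a scalar.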

Ok, after looking at the data I see that this is a multiindex. But in this case, wouldn't

ds += u_nk.loc[i*186/10].loc[lambda_].values[0]

be cleaner?

(I have no idea how to get something like ".loc[lambda_, (0.0, 0.0)]" instead of the values[0] but I find pandas indexing confusing...)

Yes, but we cannot know lambda_ beforehand, and it could be (0.0, 0.0) or (0.0, 0.0, 0.0), which is why for some of the tests I could only use iloc.
In principle we could derive the labels by analysing the dataframe, but that would make the unit test too complicated, so I think the current form is a good balance.
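On the question of avoiding .values[0]: a sketch, assuming a toy frame with (time, fep-lambda) rows and plain float column labels, showing that a complete row key passed as a tuple returns a scalar. It relies on knowing the full row label in advance, which is the constraint described above. The data are invented.

```python
import pandas as pd

# Toy u_nk-like frame: rows are (time, fep-lambda), columns are lambda states.
index = pd.MultiIndex.from_tuples(
    [(0.0, 0.0), (18.6, 0.0), (37.2, 0.0)], names=["time", "fep-lambda"]
)
u_nk = pd.DataFrame({0.0: [1.0, 2.0, 3.0], 1.0: [4.0, 5.0, 6.0]}, index=index)

# Partial row key (time only): the column selection gives a one-element
# Series, so the scalar has to be pulled out with .values[0].
via_values = u_nk.loc[18.6][1.0].values[0]

# Complete row key as a tuple, plus the column label: a scalar directly.
via_full_key = u_nk.loc[(18.6, 0.0), 1.0]

print(via_values, via_full_key)
```

The number of entries in the row tuple changes with the number of lambda columns in the dataset, so a test that runs over several datasets would have to build that tuple per dataset.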

orbeckst · 2022-12-05T20:34:16Z

src/alchemlyb/tests/parsing/test_gmx.py

@@ -124,7 +124,7 @@ def test_u_nk_with_total_energy():

    # Check one specific value in the dataframe
    assert_almost_equal(
-        extract_u_nk(dataset['data']['AllStates'][0], T=300).iloc[0][0],
+        extract_u_nk(dataset['data']['AllStates'][0], T=300).loc[0][(0.0,0.0)].values[0],


PEP8 and using .loc

Suggested change

-        extract_u_nk(dataset['data']['AllStates'][0], T=300).loc[0][(0.0,0.0)].values[0],
+        extract_u_nk(dataset['data']['AllStates'][0], T=300).loc[0, (0.0, 0.0)].values[0],

orbeckst · 2022-12-05T20:34:26Z

src/alchemlyb/tests/parsing/test_gmx.py

@@ -142,7 +142,7 @@ def test_u_nk_with_potential_energy():

    # Check one specific value in the dataframe
    assert_almost_equal(
-        extract_u_nk(dataset['data']['AllStates'][0], T=300).iloc[0][0],
+        extract_u_nk(dataset['data']['AllStates'][0], T=300).loc[0][(0.0,0.0)].values[0],


orbeckst · 2022-12-05T20:34:31Z

src/alchemlyb/tests/parsing/test_gmx.py

@@ -161,7 +161,7 @@ def test_u_nk_without_energy():

    # Check one specific value in the dataframe
    assert_almost_equal(
-        extract_u_nk(dataset['data']['AllStates'][0], T=300).iloc[0][0],
+        extract_u_nk(dataset['data']['AllStates'][0], T=300).loc[0][(0.0,0.0)].values[0],


orbeckst · 2022-12-05T20:36:40Z

src/alchemlyb/tests/test_fep_estimators.py

@@ -139,6 +139,8 @@ def compare_delta_f(self, X_delta_f):
        assert X_delta_f[2] == pytest.approx(d_delta_f, rel=1e-3)

    def get_delta_f(self, est):
+        # Use .iloc[0, -1] as we want to cater for both


thanks for the comment

orbeckst · 2022-12-05T20:39:03Z

src/alchemlyb/tests/test_units.py

@@ -60,7 +60,7 @@ def dhdl():

    def test_kt2kt_number(self, dhdl):
        new_dhdl = to_kT(dhdl)
-        assert 12.9 == pytest.approx(new_dhdl.iloc[0, 0], 0.1)
+        assert 12.9 == pytest.approx(new_dhdl.loc[(0.0,0.0)], 0.1)


pep8 space after comma

Suggested change

-        assert 12.9 == pytest.approx(new_dhdl.loc[(0.0,0.0)], 0.1)
+        assert 12.9 == pytest.approx(new_dhdl.loc[(0.0, 0.0)], 0.1)

orbeckst · 2022-12-05T20:39:10Z

src/alchemlyb/tests/test_units.py

@@ -74,7 +74,7 @@ def test_kj2kt_unit(self, dhdl):
    def test_kj2kt_number(self, dhdl):
        dhdl.attrs['energy_unit'] = 'kJ/mol'
        new_dhdl = to_kT(dhdl)
-        assert 5.0 == pytest.approx(new_dhdl.iloc[0, 0], 0.1)
+        assert 5.0 == pytest.approx(new_dhdl.loc[(0.0,0.0)], 0.1)


pep8 space after comma

orbeckst · 2022-12-05T20:39:16Z

src/alchemlyb/tests/test_units.py

@@ -84,7 +84,7 @@ def test_kcal2kt_unit(self, dhdl):
    def test_kcal2kt_number(self, dhdl):
        dhdl.attrs['energy_unit'] = 'kcal/mol'
        new_dhdl = to_kT(dhdl)
-        assert 21.0 == pytest.approx(new_dhdl.iloc[0, 0], 0.1)
+        assert 21.0 == pytest.approx(new_dhdl.loc[(0.0,0.0)], 0.1)


pep8 space after comma

orbeckst · 2022-12-05T20:39:26Z

src/alchemlyb/tests/test_visualisation.py

+        forward.append(estimate.delta_f_.loc[0.0,1.0])
+        forward_error.append(estimate.d_delta_f_.loc[0.0,1.0])


pep8 space after comma

orbeckst · 2022-12-05T20:39:35Z

src/alchemlyb/tests/test_visualisation.py

+        backward.append(estimate.delta_f_.loc[0.0,1.0])
+        backward_error.append(estimate.d_delta_f_.loc[0.0,1.0])


pep8 space after comma

xiki-tempula · 2022-12-05T20:48:25Z

@orbeckst For the PEP8 fixes, I guess I could just run black project-wide, the same as for flamel?

orbeckst · 2022-12-05T20:50:42Z

Can we do black in a separate PR? Reformatting hides any relevant changes. For this PR, do the few fixes manually and then we can blackify in a separate PR.

xiki-tempula · 2022-12-05T20:53:53Z

@orbeckst Would it be easier if I merge this PR as it is and then do a black PR?
I cannot change the .loc[0][0]-style lookups.
As for the space after the comma, black will sort that out automatically.

orbeckst

that's fine then; run black later

pandas indexing is a bit confusing...

xiki-tempula added 6 commits August 14, 2022 20:56

remove iloc

f1a8ae2

update loc

afc54b8

Merge branch 'master' into feat_iloc

c623b14

update

3fd1005

update

7609dc2

ifx test

50bf8aa

xiki-tempula mentioned this pull request Oct 25, 2022

release 1.0 #208

Closed

6 tasks

xiki-tempula added 3 commits December 4, 2022 10:39

Merge branch 'master' into feat_iloc

e049ccb

update

e4c0a4d

update

6839b8b

xiki-tempula requested a review from orbeckst December 4, 2022 10:50

xiki-tempula marked this pull request as ready for review December 4, 2022 10:50

xiki-tempula added 5 commits December 4, 2022 18:32

update

91a3928

update

0d81a47

update

1afd2b9

update

1ab2c67

Merge branch 'feat_RTD' into feat_iloc

7d4cb67

xiki-tempula mentioned this pull request Dec 5, 2022

Use fixture for the test #278

Merged

Merge branch 'master' into feat_iloc

75c0972

orbeckst approved these changes Dec 5, 2022

orbeckst assigned xiki-tempula Dec 5, 2022

update the RTD part

45c9e0a

orbeckst approved these changes Dec 5, 2022

orbeckst self-assigned this Dec 6, 2022

orbeckst merged commit 380302d into alchemistry:master Dec 6, 2022

xiki-tempula deleted the feat_iloc branch December 6, 2022 10:34
