
REF/TST: Stop using singleton fixtures #24769

Closed
wants to merge 14 commits into master from jbrockmendel:singletons

Conversation

jbrockmendel
Member

While trying to collect reduction tests, I found that we are still using a lot of singleton fixtures (agreed undesirable in #23701). This gets rid of the singleton fixtures in tests.frame.

Following this it will be easier to more usefully parametrize some of those tests.

@gfyoung added the Testing (pandas testing functions or related to the test suite) and Clean labels on Jan 14, 2019
@gfyoung
Member

gfyoung commented Jan 14, 2019

we are still using a lot of singleton fixtures (agreed undesirable

@jbrockmendel : Indeed, too bad we can't lint these very easily AFAICT...

@jbrockmendel
Member Author

The test failures look very weird to me. @gfyoung any ideas?

@gfyoung
Member

gfyoung commented Jan 14, 2019

The test failures look very weird to me.

@jbrockmendel : I see only one test failure, regarding your refactoring of test_rename_inplace. I'm not sure exactly why it's failing now on those platforms, but the failure doesn't seem completely weird to me. It seems to be suggesting that the .copy somehow didn't actually create a copy of the object.

@jreback jreback left a comment (Contributor)

really? this is the entire point of fixtures. pls show an example of how this actually helps

@jbrockmendel
Member Author

really? this is the entire point of fixtures.

@jreback we discussed this back in #23701 and you agreed "if it’s truly a singleton the[n] no a fixture is not warranted". Please see the discussion there.

@jreback
Contributor

jreback commented Jan 14, 2019

@jreback we discussed this back in #23701 and you agreed "if it’s truly a singleton the[n] no a fixture is not warranted". Please see the discussion there.

Key point: IF it's truly a singleton. These are used all over the place, or at least should be. There are many, many other files that actually just use TestData that should simply use these fixtures. They just have not been converted.

@TomAugspurger
Contributor

Are you two using different definitions of singleton? @jbrockmendel's is (roughly, ignoring things like setup / teardown that don't apply to these) "a non-parametrized fixture". In this case, using a helper method in tm to generate the data seems equivalent to accepting the data as a fixture.

@jreback's definition might be "a fixture that's only used in a single test"?

@jbrockmendel
Member Author

@TomAugspurger thanks, I think you've got it right on the terminology mismatch.

Even if a fixture is used in many places, for one-liners or otherwise non-parametrized fixtures, we're better off inlining them or putting them in pd.util.testing, as this PR does. We get improved copy/paste-ability, reduced pytest lock-in, and improved clarity. As a reader, which of these is clearer:

    def test_td64ser_div_numeric_scalar(self, tdser):
        expected = Series(['29.5D', '29.5D', 'NaT'], dtype='timedelta64[ns]')
        result = tdser / 2
        tm.assert_equal(result, expected)

    def test_td64ser_div_numeric_scalar(self):
        tdser = pd.Series(['59 Days', '59 Days', 'NaT'], dtype='timedelta64[ns]')
        expected = Series(['29.5D', '29.5D', 'NaT'], dtype='timedelta64[ns]')
        result = tdser / 2
        tm.assert_equal(result, expected)

I claim the latter is clearer by a wide margin.

@codecov

codecov bot commented Jan 14, 2019

Codecov Report

Merging #24769 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #24769      +/-   ##
==========================================
+ Coverage   92.38%   92.39%   +<.01%     
==========================================
  Files         166      166              
  Lines       52402    52439      +37     
==========================================
+ Hits        48414    48450      +36     
- Misses       3988     3989       +1
Flag        Coverage Δ
#multiple   90.81% <100%> (ø) ⬆️
#single     42.88% <21.62%> (-0.01%) ⬇️

Impacted Files           Coverage Δ
pandas/util/testing.py   88.43% <100%> (+0.3%) ⬆️

Continue to review full report at Codecov.

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fa12b9e...3b87f34.

@simonjayhawkins
Member

@jbrockmendel

The test failures look very weird to me. @gfyoung any ideas?

If the only change to this test is replacing the fixture, then it could be a change to tm.N or tm.K.

@simonjayhawkins
Member

In cases like this, pytest --showlocals would be helpful. You can also use print statements to debug; the output should be displayed for failing tests. That's the default locally; I haven't tried it with the CI.

@jbrockmendel
Member Author

I can't reproduce this locally. Can anyone else?

@jbrockmendel
Member Author

If somehow the test behavior depends on whether a fixture is used, that would be troubling.

@TomAugspurger
Contributor

TomAugspurger commented Jan 15, 2019

I can reproduce the failure in 032a74e locally, and on 2253bdf.

@TomAugspurger
Contributor

Interestingly, it seems to be a flaky test:

 $ pytest pandas/tests/frame/test_alter_axes.py -k test_rename_inplace --count=5
================================================================================ test session starts ================================================================================
platform darwin -- Python 3.7.2, pytest-4.1.0, py-1.6.0, pluggy-0.7.1
rootdir: /Users/taugspurger/sandbox/pandas, inifile: setup.cfg
plugins: xdist-1.23.2, repeat-0.7.0, forked-0.2, faulthandler-1.5.0, cov-2.6.0, hypothesis-3.59.1
collected 3735 items / 3730 deselected

pandas/tests/frame/test_alter_axes.py .....                                                                                                                                   [100%]

===================================================================== 5 passed, 3730 deselected in 0.60 seconds =====================================================================

 $ pytest pandas/tests/frame/test_alter_axes.py -k test_rename_inplace --count=5
================================================================================ test session starts ================================================================================
platform darwin -- Python 3.7.2, pytest-4.1.0, py-1.6.0, pluggy-0.7.1
rootdir: /Users/taugspurger/sandbox/pandas, inifile: setup.cfg
plugins: xdist-1.23.2, repeat-0.7.0, forked-0.2, faulthandler-1.5.0, cov-2.6.0, hypothesis-3.59.1
collected 3735 items / 3730 deselected

pandas/tests/frame/test_alter_axes.py F....                                                                                                                                   [100%]

===================================================================================== FAILURES ======================================================================================
__________________________________________________________________ TestDataFrameAlterAxes.test_rename_inplace[1/5] __________________________________________________________________

self = <pandas.tests.frame.test_alter_axes.TestDataFrameAlterAxes object at 0x10ea03eb8>

    def test_rename_inplace(self):
        float_frame = DataFrame(tm.getSeriesData())

        float_frame.rename(columns={'C': 'foo'})
        assert 'C' in float_frame
        assert 'foo' not in float_frame

        c_id = id(float_frame['C'])
        float_frame = float_frame.copy()
        float_frame.rename(columns={'C': 'foo'}, inplace=True)

        assert 'C' not in float_frame
        assert 'foo' in float_frame
>       assert id(float_frame['foo']) != c_id
E       assert 4519856224 != 4519856224
E        +  where 4519856224 = id(3iGk6Mrg5P   -0.684722\nafMjhpeaJk   -1.726612\nlHGFsvRhY6    0.868385\n44rJSOpKPB    0.413032\ndBoNJu9vYI    0.017283\nHT4...4\ngwBsqTWequ    0.262972\nmBUNfZKGWs   -0.194150\nxNhGyHIH99    1.499986\nXnHXz6qUoM   -0.264342\nName: foo, dtype: float64)

pandas/tests/frame/test_alter_axes.py:693: AssertionError
================================================================ 1 failed, 4 passed, 3730 deselected in 0.85 seconds ================================================================

Just had 6/100 fail.

@TomAugspurger
Contributor

On the PR itself, I personally don't care about copy/paste-ability, and am not too concerned with pytest lock-in.

I do care about how long the test suite takes, so reducing that without sacrificing readability / maintainability is a win. And to me, a simple fixture that just provides a value (no setup / teardown) is equivalent to a method call inside the test.

@jreback
Contributor

jreback commented Jan 15, 2019

i am -1 on this change - this goes against the entire maxim of using fixtures

this just puts more distance between what is being tested and construction

this is just reinventing pytest and is a giant leap backwards

@jreback jreback left a comment (Contributor)

as commented

@TomAugspurger
Contributor

this just puts more distance between what is being tested and construction

How are you measuring distance? It's just a function call either way. And, depending on your editor, this may be less "distance" since goto definition typically works for function calls, but not necessarily for fixtures.

@jbrockmendel
Member Author

this just puts more distance between what is being tested and construction

Um... the opposite of that. In many cases the PR doesn't even have to call into tm; it just inlines the construction.

What it does do is decrease the ambiguity about where something is defined.

this is just reinventing pytest and is a giant leap backwards

Since when did pytest invent inlining code or function calls?

By this logic, every object we use in more than one test should be made into a fixture. Where would you draw the line and why?

I do care about how long the test suite takes, so reducing that without sacrificing readability / maintainability is a win. And to me, a simple fixture that just provides a value (no setup / teardown) is equivalent to a method call inside the test.

For the places where the fixtures are inlined instead of moved into tm, there is a strict decrease in the function call overhead (though I doubt it is big enough to be measurable). For others it should balance out exactly.

so reducing that without sacrificing readability / maintainability is a win

I find the readability strictly improved by this. It also makes it easier to figure out what refactoring opportunities are available, which has other maintainability/perf upside.


In the status quo there are a handful of fixtures in tests.frame.conftest that aren't used at all. In other cases tests have a fixture in the arguments but never actually use it; the linting doesn't detect that.

@TomAugspurger
Contributor

For the places where the fixtures are inlined instead of moved into tm, there is a strict decrease in the function call overhead (though I doubt it is big enough to be measurable). For others it should balance out exactly.

Just to be clear, I was talking about reduced setup time from avoiding fixtures (btw, do you have a rough idea of how much this shaves off? I suppose a pytest --collect-only would show it?)

@jbrockmendel
Member Author

Just to be clear, I was talking about reduced setup time from avoiding fixtures (btw, do you have a rough idea of how much this shaves off? I suppose a pytest --collect-only would show it?)

With N=5 runs, they are indistinguishable.

Incidentally, for me the collection takes about 12% of test runtime (with parallelization). I think some of the way to cut down on that is to push back on over-parametrization.

@simonjayhawkins
Member

Just had this CI failure:

    def test_setindex(self, string_series):
        # wrong type
        msg = (r"Index\(\.\.\.\) must be called with a collection of some"
               r" kind, None was passed")
        with pytest.raises(TypeError, match=msg):
            string_series.index = None
    
        # wrong length
        msg = (r"Length mismatch: Expected axis has 30 elements, new values"
               r" have 29 elements")
        with pytest.raises(ValueError, match=msg):
>           string_series.index = np.arange(len(string_series) - 1)
E           AssertionError: Pattern 'Length mismatch: Expected axis has 30 elements, new values have 29 elements' not found in 'Length mismatch: Expected axis has 100 elements, new values have 99 elements'
pandas/tests/series/test_alter_axes.py:28: AssertionError
___________ TestSeriesConstructors.test_constructor_dtype_datetime64 ___________

So I think that tm.N and tm.K take on different values depending on the order in which the tests are run.

@TomAugspurger
Contributor

Seems like we have tests that set those, which seems iffy:

$ ag 'tm.N ='
doc/source/reshaping.rst
18:   tm.N = 3
43:   tm.N = 3

pandas/tests/test_multilevel.py
50:        tm.N = 100

pandas/tests/plotting/test_datetimelike.py
317:        tm.N = 300
319:        tm.N = n

pandas/tests/indexing/multiindex/conftest.py
24:    tm.N = 100

@jbrockmendel
Member Author

@TomAugspurger agreed that altering tm.K and tm.N is not great. But I don't see how that would change the affected test's behavior, do you?

@simonjayhawkins
Member

Interestingly, it seems to be a flaky test

float_frame defined in the function:

$ pytest pandas/tests/frame/test_alter_axes.py -k test_rename_inplace --count=100 --tb=no
============================= test session starts =============================
platform win32 -- Python 3.7.2, pytest-4.1.1, py-1.7.0, pluggy-0.8.1
hypothesis profile 'ci' -> deadline=500, timeout=unlimited, suppress_health_check=[HealthCheck.too_slow], database=DirectoryBasedExampleDatabase('C:\\Users\\simon\\OneDrive\\code\\pandas-simonjayhawkins\\.hypothesis\\examples')
rootdir: C:\Users\simon\OneDrive\code\pandas-simonjayhawkins, inifile: setup.cfg
plugins: xdist-1.26.0, repeat-0.7.0, forked-0.2, cov-2.6.1, hypothesis-4.0.1
collected 49500 items / 49400 deselected

pandas\tests\frame\test_alter_axes.py .F.....FF.....FFF......F.......FFF [ 34%]
......F.......FFF......F.......FF......F.....F.....F....F...F.....       [100%]

=========== 22 failed, 78 passed, 49400 deselected in 13.29 seconds ===========

float_frame fixture:

$ pytest pandas/tests/frame/test_alter_axes.py -k test_rename_inplace --count=100 --tb=no
============================= test session starts =============================
platform win32 -- Python 3.7.2, pytest-4.1.1, py-1.7.0, pluggy-0.8.1
hypothesis profile 'ci' -> deadline=500, timeout=unlimited, suppress_health_check=[HealthCheck.too_slow], database=DirectoryBasedExampleDatabase('C:\\Users\\simon\\OneDrive\\code\\pandas-simonjayhawkins\\.hypothesis\\examples')
rootdir: C:\Users\simon\OneDrive\code\pandas-simonjayhawkins, inifile: setup.cfg
plugins: xdist-1.26.0, repeat-0.7.0, forked-0.2, cov-2.6.1, hypothesis-4.0.1
collected 49500 items / 49400 deselected

pandas\tests\frame\test_alter_axes.py .................................. [ 34%]
..................................................................       [100%]

================ 100 passed, 49400 deselected in 13.20 seconds ================

Change the code to:

    def test_rename_inplace(self):
        float_frame = DataFrame(tm.getSeriesData())

        float_frame.rename(columns={'C': 'foo'})
        assert 'C' in float_frame
        assert 'foo' not in float_frame

        c_id = id(float_frame['C'])
        float_frame2 = float_frame.copy()
        float_frame2.rename(columns={'C': 'foo'}, inplace=True)

        assert 'C' not in float_frame2
        assert 'foo' in float_frame2
        assert id(float_frame2['foo']) != c_id
$ pytest pandas/tests/frame/test_alter_axes.py -k test_rename_inplace --count=100 --tb=no
============================= test session starts =============================
platform win32 -- Python 3.7.2, pytest-4.1.1, py-1.7.0, pluggy-0.8.1
hypothesis profile 'ci' -> deadline=500, timeout=unlimited, suppress_health_check=[HealthCheck.too_slow], database=DirectoryBasedExampleDatabase('C:\\Users\\simon\\OneDrive\\code\\pandas-simonjayhawkins\\.hypothesis\\examples')
rootdir: C:\Users\simon\OneDrive\code\pandas-simonjayhawkins, inifile: setup.cfg
plugins: xdist-1.26.0, repeat-0.7.0, forked-0.2, cov-2.6.1, hypothesis-4.0.1
collected 49500 items / 49400 deselected

pandas\tests\frame\test_alter_axes.py .................................. [ 34%]
..................................................................       [100%]

================ 100 passed, 49400 deselected in 11.97 seconds ================

So it appears to be due to reuse of the memory address after the original object is dereferenced and freed: id() is only unique among objects that are alive at the same time.
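In CPython, id() is the object's memory address and is only guaranteed unique among simultaneously-alive objects, so a freed object's address can be handed to the next allocation. A minimal sketch (not pandas code) of the effect:

```python
# CPython's id() is only unique among objects that are alive at the same
# time. Once the old column Series is garbage-collected, a newly created
# object may receive the same address, which is what made the
# `id(float_frame['foo']) != c_id` assertion flaky.
class Payload:
    def __init__(self, value):
        self.value = value

a = Payload(1)
b = Payload(2)
# While both objects are alive, their ids are guaranteed to differ.
assert id(a) != id(b)

old_id = id(a)
del a  # drop the only reference; CPython frees the object immediately
c = Payload(3)
# c *may* reuse the freed address; CPython makes no promise either way,
# so comparing ids across object lifetimes is inherently flaky.
reused = id(c) == old_id
print("address reused:", reused)
```

This is why every fix that works below (fixture, parametrize, keeping float_frame_orig, renaming the copy) works: each one keeps the original object alive across the id comparison.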

@h-vetinari
Contributor

@jreback @jbrockmendel @simonjayhawkins
The reason most of the fixtures coming out of TestData are singletons at all is that there is an ongoing (cough) transition from the old TestData-attributes-as-fixtures towards proper fixtures; see #22236 (where @jreback asked me to open the following): #22471, #22550.

I haven't touched these issues that much anymore, since it was supposed to be a "good first issue" (and had some response along those lines for a while), and I was busy with other PRs.

@simonjayhawkins
Member

100/100 pass rate with

    @pytest.mark.parametrize('float_frame', [DataFrame(tm.getSeriesData())])
    def test_rename_inplace(self, float_frame):
        float_frame.rename(columns={'C': 'foo'})

Also with holding a reference to the original object:

    def test_rename_inplace(self):
        float_frame = DataFrame(tm.getSeriesData())
        float_frame_orig = float_frame

        float_frame.rename(columns={'C': 'foo'})
        assert 'C' in float_frame
        assert 'foo' not in float_frame

        c_id = id(float_frame['C'])
        float_frame = float_frame.copy()
        float_frame.rename(columns={'C': 'foo'}, inplace=True)

        assert 'C' not in float_frame
        assert 'foo' in float_frame
        assert id(float_frame['foo']) != c_id
        assert float_frame is not float_frame_orig

@jbrockmendel
Member Author

100/100 pass rate with

Is this on a platform/version that was previously failing? We've still got three failing builds.

@simonjayhawkins
Member

100/100 pass rate with

Is this on a platform/version that was previously failing? We've still got three failing builds.

Yes, I'm running locally on Windows and getting failures with a copy of your branch.

$ git log
commit fde7cd88b23a4711398e4708fac29a95984fe16f (HEAD -> jbrockmendel-singletons)
Author: Brock Mendel <jbrockmendel@gmail.com>
Date:   Sun Jan 20 10:28:14 2019 -0800

    See if removing duplicate fixture is OK now...

Four methods of preventing dereferencing always pass: a fixture, @pytest.mark.parametrize, assigning the original object to another variable, and using a different variable for the copy.

Before the fixture was created, the float_frame object would have been created outside the function, so it makes sense that a reference to it would have been kept and the test passed.

@jbrockmendel
Member Author

Thanks for digging into this.

so it makes sense that, before the fixture was created, a reference would have been kept to the float_frame object and the test passed.

I think I follow. For the time being I'm going to use the float_frame_orig edit.

@jbrockmendel jbrockmendel deleted the singletons branch April 5, 2020 17:37