
REF/TST: Stop using singleton fixtures #24769

Closed
wants to merge 14 commits into master from jbrockmendel:singletons

Conversation

jbrockmendel
Member

While trying to collect reduction tests, I found that we are still using a lot of singleton fixtures (agreed undesirable in #23701). This gets rid of the singleton fixtures in tests.frame.

Following this it will be easier to more usefully parametrize some of those tests.

@gfyoung added the Testing (pandas testing functions or related to the test suite) and Clean labels on Jan 14, 2019
@gfyoung
Member

gfyoung commented Jan 14, 2019

we are still using a lot of singleton fixtures (agreed undesirable

@jbrockmendel : Indeed, too bad we can't lint these very easily AFAICT...

@jbrockmendel
Member Author

The test failures look very weird to me. @gfyoung any ideas?

@gfyoung
Member

gfyoung commented Jan 14, 2019

The test failures look very weird to me.

@jbrockmendel : I see only one test failure, regarding your refactoring of test_rename_inplace. I'm not sure exactly why it's failing now on those platforms, but the failure doesn't seem completely weird to me. It seems to be suggesting that the .copy somehow didn't actually create a copy of the object.

@jreback jreback left a comment (Contributor)

really? this is the entire point of fixtures. pls show an example of how this actually helps

@jbrockmendel
Member Author

really? this is the entire point of fixtures.

@jreback we discussed this back in #23701 and you agreed "if it’s truly a singleton the[n] no a fixture is not warranted". Please see the discussion there.

@jreback
Contributor

jreback commented Jan 14, 2019

@jreback we discussed this back in #23701 and you agreed "if it’s truly a singleton the[n] no a fixture is not warranted". Please see the discussion there.

Key point: IF it's truly a singleton. These are used all over the place, or at least should be. There are many, many other files that actually just use TestData that should simply use these fixtures. They just have not been converted.

@TomAugspurger
Contributor

Are you two using different definitions of singleton? @jbrockmendel's is (roughly, ignoring things like setup / teardown that don't apply to these) "a non-parametrized fixture". In this case, using a helper method in tm to generate the data seems equivalent to accepting the data as a fixture.

@jreback's definition might be "a fixture that's only used in a single test"?

@jbrockmendel
Member Author

@TomAugspurger thanks, I think you've got it right on the terminology mismatch.

Even if a fixture is used in many places, for one-liners or otherwise non-parametrized fixtures, we're better off inlining them or putting them in pd.util.testing, as this PR does. We get improved copy/paste-ability, reduced pytest lock-in, and improved clarity. As a reader, which of these is clearer:

    def test_td64ser_div_numeric_scalar(self, tdser):
        expected = Series(['29.5D', '29.5D', 'NaT'], dtype='timedelta64[ns]')
        result = tdser / 2
        tm.assert_equal(result, expected)

    def test_td64ser_div_numeric_scalar(self):
        tdser = pd.Series(['59 Days', '59 Days', 'NaT'], dtype='timedelta64[ns]')
        expected = Series(['29.5D', '29.5D', 'NaT'], dtype='timedelta64[ns]')
        result = tdser / 2
        tm.assert_equal(result, expected)

I claim the latter is clearer by a wide margin.

@codecov

codecov bot commented Jan 14, 2019

Codecov Report

Merging #24769 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #24769      +/-   ##
==========================================
+ Coverage   92.38%   92.39%   +<.01%     
==========================================
  Files         166      166              
  Lines       52402    52439      +37     
==========================================
+ Hits        48414    48450      +36     
- Misses       3988     3989       +1
Flag        Coverage Δ
#multiple   90.81% <100%> (ø) ⬆️
#single     42.88% <21.62%> (-0.01%) ⬇️

Impacted Files           Coverage Δ
pandas/util/testing.py   88.43% <100%> (+0.3%) ⬆️

Continue to review full report at Codecov.

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fa12b9e...3b87f34.

@simonjayhawkins
Member

@jbrockmendel

The test failures look very weird to me. @gfyoung any ideas?

If the only change to this test is replacing the fixture, then it could be a change to tm.N or tm.K.

@simonjayhawkins
Member

In cases like this, pytest --showlocals would be helpful. You can also use print statements to debug; the output should be displayed for failing tests. That's the default locally; I haven't tried it with the CI.

@jbrockmendel
Member Author

I can't reproduce this locally. Can anyone else?

@jbrockmendel
Member Author

If somehow the test behavior depends on whether a fixture is used, that would be troubling.

@TomAugspurger
Contributor

TomAugspurger commented Jan 15, 2019

I can reproduce the failure in 032a74e locally, and on 2253bdf.

@TomAugspurger
Contributor

Interestingly, it seems to be a flaky test:

 $ pytest pandas/tests/frame/test_alter_axes.py -k test_rename_inplace --count=5
================================================================================ test session starts ================================================================================
platform darwin -- Python 3.7.2, pytest-4.1.0, py-1.6.0, pluggy-0.7.1
rootdir: /Users/taugspurger/sandbox/pandas, inifile: setup.cfg
plugins: xdist-1.23.2, repeat-0.7.0, forked-0.2, faulthandler-1.5.0, cov-2.6.0, hypothesis-3.59.1
collected 3735 items / 3730 deselected

pandas/tests/frame/test_alter_axes.py .....                                                                                                                                   [100%]

===================================================================== 5 passed, 3730 deselected in 0.60 seconds =====================================================================

 $ pytest pandas/tests/frame/test_alter_axes.py -k test_rename_inplace --count=5
================================================================================ test session starts ================================================================================
platform darwin -- Python 3.7.2, pytest-4.1.0, py-1.6.0, pluggy-0.7.1
rootdir: /Users/taugspurger/sandbox/pandas, inifile: setup.cfg
plugins: xdist-1.23.2, repeat-0.7.0, forked-0.2, faulthandler-1.5.0, cov-2.6.0, hypothesis-3.59.1
collected 3735 items / 3730 deselected

pandas/tests/frame/test_alter_axes.py F....                                                                                                                                   [100%]

===================================================================================== FAILURES ======================================================================================
__________________________________________________________________ TestDataFrameAlterAxes.test_rename_inplace[1/5] __________________________________________________________________

self = <pandas.tests.frame.test_alter_axes.TestDataFrameAlterAxes object at 0x10ea03eb8>

    def test_rename_inplace(self):
        float_frame = DataFrame(tm.getSeriesData())

        float_frame.rename(columns={'C': 'foo'})
        assert 'C' in float_frame
        assert 'foo' not in float_frame

        c_id = id(float_frame['C'])
        float_frame = float_frame.copy()
        float_frame.rename(columns={'C': 'foo'}, inplace=True)

        assert 'C' not in float_frame
        assert 'foo' in float_frame
>       assert id(float_frame['foo']) != c_id
E       assert 4519856224 != 4519856224
E        +  where 4519856224 = id(3iGk6Mrg5P   -0.684722\nafMjhpeaJk   -1.726612\nlHGFsvRhY6    0.868385\n44rJSOpKPB    0.413032\ndBoNJu9vYI    0.017283\nHT4...4\ngwBsqTWequ    0.262972\nmBUNfZKGWs   -0.194150\nxNhGyHIH99    1.499986\nXnHXz6qUoM   -0.264342\nName: foo, dtype: float64)

pandas/tests/frame/test_alter_axes.py:693: AssertionError
================================================================ 1 failed, 4 passed, 3730 deselected in 0.85 seconds ================================================================

Just had 6/100 fail.

@TomAugspurger
Contributor

On the PR itself, I personally don't care about copy/paste-ability, and am not too concerned with pytest lock-in.

I do care about how long the test suite takes, so reducing that without sacrificing readability / maintainability is a win. And to me, a simple fixture that just provides a value (no setup / teardown) is equivalent to a method call inside the test.

@jreback
Contributor

jreback commented Jan 15, 2019

i am -1 on this change - this goes against the entire maxim of using fixtures

this just puts more distance between what is being tested and construction

this is just reinventing pytest and is a giant leap backwards

@jreback jreback left a comment (Contributor)

as commented

@TomAugspurger
Contributor

this just puts more distance between what is being tested and construction

How are you measuring distance? It's just a function call either way. And, depending on your editor, this may be less "distance" since goto definition typically works for function calls, but not necessarily for fixtures.

@jbrockmendel
Member Author

this just puts more distance between what is being tested and construction

Um... the opposite of that. In many cases the PR doesn't even have to call into tm; it just inlines the construction.

What it does do is decrease the ambiguity about where something is defined.

this is just reinventing pytest and is a giant leap backwards

Since when did pytest invent inlining code or function calls?

By this logic, every object we use in more than one test should be made into a fixture. Where would you draw the line and why?

I do care about how long the test suite takes, so reducing that without sacrificing readability / maintainability is a win. And to me, a simple fixture that just provides a value (no setup / teardown) is equivalent to a method call inside the test.

For the places where the fixtures are inlined instead of moved into tm, there is a strict decrease in the function call overhead (though I doubt it is big enough to be measurable). For others it should balance out exactly.

so reducing that without sacrificing readability / maintainability is a win

I find the readability strictly improved by this. It also makes it easier to figure out what refactoring opportunities are available, which has other maintainability/perf upside.


In the status quo there are a handful of fixtures in tests.frame.conftest that aren't used at all. In other cases tests have a fixture in the arguments but never actually use it; the linting doesn't detect that.

@TomAugspurger
Contributor

For the places where the fixtures are inlined instead of moved into tm, there is a strict decrease in the function call overhead (though I doubt it is big enough to be measurable). For others it should balance out exactly.

Just to be clear, I was talking about reduced setup time from avoiding fixtures (btw, do you have a rough idea of how much this shaves off? I suppose a pytest --collect-only would show it?)

@jbrockmendel
Member Author

Just to be clear, I was talking about reduced setup time from avoiding fixtures (btw, do you have a rough idea of how much this shaves off? I suppose a pytest --collect-only would show it?)

With N=5 runs, they are indistinguishable.

Incidentally, for me the collection takes about 12% of test runtime (with parallelization). I think some of the way to cut down on that is to push back on over-parametrization.

@simonjayhawkins
Member

Just had this CI failure:

    def test_setindex(self, string_series):
        # wrong type
        msg = (r"Index\(\.\.\.\) must be called with a collection of some"
               r" kind, None was passed")
        with pytest.raises(TypeError, match=msg):
            string_series.index = None
    
        # wrong length
        msg = (r"Length mismatch: Expected axis has 30 elements, new values"
               r" have 29 elements")
        with pytest.raises(ValueError, match=msg):
>           string_series.index = np.arange(len(string_series) - 1)
E           AssertionError: Pattern 'Length mismatch: Expected axis has 30 elements, new values have 29 elements' not found in 'Length mismatch: Expected axis has 100 elements, new values have 99 elements'
pandas/tests/series/test_alter_axes.py:28: AssertionError
___________ TestSeriesConstructors.test_constructor_dtype_datetime64 ___________

So I think that tm.N and tm.K take on different values depending on the order in which the tests are run.

@TomAugspurger
Contributor

Seems like we have tests that set those, which seems iffy:

$ ag 'tm.N ='
doc/source/reshaping.rst
18:   tm.N = 3
43:   tm.N = 3

pandas/tests/test_multilevel.py
50:        tm.N = 100

pandas/tests/plotting/test_datetimelike.py
317:        tm.N = 300
319:        tm.N = n

pandas/tests/indexing/multiindex/conftest.py
24:    tm.N = 100

@jbrockmendel
Member Author

@TomAugspurger agreed that altering tm.K and tm.N is not great. But I don't see how that would change the affected test's behavior, do you?

@simonjayhawkins
Member

Interestingly, it seems to be a flaky test

float_frame defined in the function:

$ pytest pandas/tests/frame/test_alter_axes.py -k test_rename_inplace --count=100 --tb=no
============================= test session starts =============================
platform win32 -- Python 3.7.2, pytest-4.1.1, py-1.7.0, pluggy-0.8.1
hypothesis profile 'ci' -> deadline=500, timeout=unlimited, suppress_health_check=[HealthCheck.too_slow], database=DirectoryBasedExampleDatabase('C:\\Users\\simon\\OneDrive\\code\\pandas-simonjayhawkins\\.hypothesis\\examples')
rootdir: C:\Users\simon\OneDrive\code\pandas-simonjayhawkins, inifile: setup.cfg
plugins: xdist-1.26.0, repeat-0.7.0, forked-0.2, cov-2.6.1, hypothesis-4.0.1
collected 49500 items / 49400 deselected

pandas\tests\frame\test_alter_axes.py .F.....FF.....FFF......F.......FFF [ 34%]
......F.......FFF......F.......FF......F.....F.....F....F...F.....       [100%]

=========== 22 failed, 78 passed, 49400 deselected in 13.29 seconds ===========

float_frame fixture:

$ pytest pandas/tests/frame/test_alter_axes.py -k test_rename_inplace --count=100 --tb=no
============================= test session starts =============================
platform win32 -- Python 3.7.2, pytest-4.1.1, py-1.7.0, pluggy-0.8.1
hypothesis profile 'ci' -> deadline=500, timeout=unlimited, suppress_health_check=[HealthCheck.too_slow], database=DirectoryBasedExampleDatabase('C:\\Users\\simon\\OneDrive\\code\\pandas-simonjayhawkins\\.hypothesis\\examples')
rootdir: C:\Users\simon\OneDrive\code\pandas-simonjayhawkins, inifile: setup.cfg
plugins: xdist-1.26.0, repeat-0.7.0, forked-0.2, cov-2.6.1, hypothesis-4.0.1
collected 49500 items / 49400 deselected

pandas\tests\frame\test_alter_axes.py .................................. [ 34%]
..................................................................       [100%]

================ 100 passed, 49400 deselected in 13.20 seconds ================

Change the code to:

    def test_rename_inplace(self):
        float_frame = DataFrame(tm.getSeriesData())

        float_frame.rename(columns={'C': 'foo'})
        assert 'C' in float_frame
        assert 'foo' not in float_frame

        c_id = id(float_frame['C'])
        float_frame2 = float_frame.copy()
        float_frame2.rename(columns={'C': 'foo'}, inplace=True)

        assert 'C' not in float_frame2
        assert 'foo' in float_frame2
        assert id(float_frame2['foo']) != c_id
$ pytest pandas/tests/frame/test_alter_axes.py -k test_rename_inplace --count=100 --tb=no
============================= test session starts =============================
platform win32 -- Python 3.7.2, pytest-4.1.1, py-1.7.0, pluggy-0.8.1
hypothesis profile 'ci' -> deadline=500, timeout=unlimited, suppress_health_check=[HealthCheck.too_slow], database=DirectoryBasedExampleDatabase('C:\\Users\\simon\\OneDrive\\code\\pandas-simonjayhawkins\\.hypothesis\\examples')
rootdir: C:\Users\simon\OneDrive\code\pandas-simonjayhawkins, inifile: setup.cfg
plugins: xdist-1.26.0, repeat-0.7.0, forked-0.2, cov-2.6.1, hypothesis-4.0.1
collected 49500 items / 49400 deselected

pandas\tests\frame\test_alter_axes.py .................................. [ 34%]
..................................................................       [100%]

================ 100 passed, 49400 deselected in 11.97 seconds ================

So it appears to be due to reuse of the memory address after the original object is dereferenced and freed: id() is only unique among objects that are alive at the same time.
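In CPython, id() is the object's memory address and is only guaranteed unique among simultaneously-alive objects, so a freed object's address can be handed to the next allocation. A minimal sketch (not pandas code) of the effect:

```python
# CPython's id() is only unique among objects that are alive at the same
# time. Once the old column Series is garbage-collected, a newly created
# object may receive the same address, which is what made the
# `id(float_frame['foo']) != c_id` assertion flaky.
class Payload:
    def __init__(self, value):
        self.value = value

a = Payload(1)
b = Payload(2)
# While both objects are alive, their ids are guaranteed to differ.
assert id(a) != id(b)

old_id = id(a)
del a  # drop the only reference; CPython frees the object immediately
c = Payload(3)
# c *may* reuse the freed address; CPython makes no promise either way,
# so comparing ids across object lifetimes is inherently flaky.
reused = id(c) == old_id
print("address reused:", reused)
```

This is why every fix that works below (fixture, parametrize, keeping float_frame_orig, renaming the copy) works: each one keeps the original object alive across the id comparison.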

@h-vetinari
Contributor

@jreback @jbrockmendel @simonjayhawkins
The reason most of the fixtures coming out of TestData are singletons at all is that there is an ongoing (cough) transition from the old TestData-attributes-as-fixtures towards proper fixtures; see #22236 (where @jreback asked me to open the following): #22471, #22550.

I haven't touched these issues that much anymore, since it was supposed to be a "good first issue" (and had some response along those lines for a while), and I was busy with other PRs.

@simonjayhawkins
Member

100/100 pass rate with

    @pytest.mark.parametrize('float_frame', [DataFrame(tm.getSeriesData())])
    def test_rename_inplace(self, float_frame):
        float_frame.rename(columns={'C': 'foo'})

Also with holding a reference to the original object:

    def test_rename_inplace(self):
        float_frame = DataFrame(tm.getSeriesData())
        float_frame_orig = float_frame

        float_frame.rename(columns={'C': 'foo'})
        assert 'C' in float_frame
        assert 'foo' not in float_frame

        c_id = id(float_frame['C'])
        float_frame = float_frame.copy()
        float_frame.rename(columns={'C': 'foo'}, inplace=True)

        assert 'C' not in float_frame
        assert 'foo' in float_frame
        assert id(float_frame['foo']) != c_id
        assert float_frame is not float_frame_orig

@jbrockmendel
Member Author

100/100 pass rate with

Is this on a platform/version that was previously failing? We've still got three failing builds.

@simonjayhawkins
Member

100/100 pass rate with

Is this on a platform/version that was previously failing? We've still got three failing builds.

Yes, I'm running locally on Windows and getting failures with a copy of your branch.

$ git log
commit fde7cd88b23a4711398e4708fac29a95984fe16f (HEAD -> jbrockmendel-singletons)
Author: Brock Mendel <jbrockmendel@gmail.com>
Date:   Sun Jan 20 10:28:14 2019 -0800

    See if removing duplicate fixture is OK now...

Four methods of preventing dereferencing always pass: a fixture, @pytest.mark.parametrize, assigning the original object to another variable, and using a different variable for the copy.

Before the fixture was created, the float_frame object would have been created outside the function, so it makes sense that a reference to it would have been kept and the test passed.

@jbrockmendel
Member Author

Thanks for digging into this.

so it makes sense that, before the fixture was created, a reference would have been kept to the float_frame object and the test passed.

I think I follow. For the time being I'm going to use the float_frame_orig edit.

@jbrockmendel jbrockmendel deleted the singletons branch April 5, 2020 17:37