CI: benchmark build is taking a long time #44450

Open · jreback opened this issue on Nov 14, 2021 · 9 comments

Labels: Benchmark (Performance (ASV) benchmarks), CI (Continuous Integration)

jreback (Contributor) commented Nov 14, 2021

xref https://github.com/pandas-dev/pandas/runs/4204157179?check_suite_focus=true

Have seen this on multiple PRs. Maybe something recently added is causing this to time out.

jreback added the CI (Continuous Integration) label on Nov 14, 2021
jreback added this to the 1.4 milestone on Nov 14, 2021
jreback (Contributor, Author) commented Nov 14, 2021

cc @pandas-dev/pandas-core

alimcmaster1 (Member) commented
Could be related to #44359

jorisvandenbossche (Member) commented Nov 15, 2021

I think this is mostly caused by the continuous expansion of our suite, with some recent additions that had a big impact on total runtime.
While it's of course good to keep increasing the coverage of our benchmark suite, we should also be careful to keep the total runtime manageable, by limiting parametrizations where they don't add much value and by ensuring the individual time functions don't take too long.

I ran the benchmarks locally with the --quick option (so without repetitions, as we do on our CI) and did a quick analysis of the results (ASV saves those in a JSON file, which I am trying to upload as an artifact on our CI in #44464).

The code for this and some results can be seen at https://nbviewer.org/gist/jorisvandenbossche/0af1c0a20ef187197ecdcfdb3545306a

Based on those results, some of the very slow ones that contribute a lot to the total runtime of the benchmarks:

(the first two items add by far the most time of the whole suite)
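
For reference, a minimal sketch of the kind of analysis described above (the actual code is in the linked gist). It assumes the ASV 0.4-style result layout, where the JSON file contains "started_at"/"ended_at" mappings from benchmark name to millisecond timestamps; the file path below is only illustrative:

```python
import json

# Path is illustrative; ASV writes one result file per machine/commit/environment.
with open(".asv/results/my-machine/some-commit-env.json") as f:
    data = json.load(f)

# Assumed ASV 0.4-style layout: per-benchmark start/end timestamps in milliseconds.
started = data.get("started_at", {})
ended = data.get("ended_at", {})
durations = {
    name: (ended[name] - started[name]) / 1000
    for name in started
    if name in ended
}

# Show the benchmarks that contribute most to the total runtime.
for name, seconds in sorted(durations.items(), key=lambda kv: kv[1], reverse=True)[:20]:
    print(f"{seconds:8.1f}s  {name}")
```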

jreback (Contributor, Author) commented Nov 15, 2021

Going to close this based on #44475, but will open a new issue about separating the benchmarks into another build.

jbrockmendel (Member) commented
I expect Joris's diagnosis is correct, and trimming the groupby parametrizations will go a long way towards fixing this.

For e.g. an ncols param that often has cases like [1, 2, 5, 10], we might find a way to only run the ncols=1 and ncols=2 cases on the CI? (a rough sketch of one way that could look is below)
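
A minimal sketch of that idea, assuming the CI job sets an environment variable (the PANDAS_ASV_QUICK name is hypothetical, not an existing setting) and using a simplified stand-in benchmark class rather than the actual pandas one:

```python
import os

import numpy as np
import pandas as pd

# Hypothetical switch: the CI job would export PANDAS_ASV_QUICK=1 to get the
# reduced grid, while local runs keep the full parametrization.
NCOLS = [1, 2] if os.environ.get("PANDAS_ASV_QUICK") else [1, 2, 5, 10]


class Describe:
    # Simplified stand-in for a wide-frame benchmark, not the actual pandas class.
    params = NCOLS
    param_names = ["ncols"]

    def setup(self, ncols):
        self.df = pd.DataFrame(np.random.randn(10_000, ncols))

    def time_describe(self, ncols):
        self.df.describe()
```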

jorisvandenbossche (Member) commented
The rolling benchmarks were not the biggest "offender", so let's keep this open for at least the other two I listed.

> For e.g. an ncols param that often has cases like [1, 2, 5, 10], we might find a way to only run the ncols=1 and ncols=2 cases on the CI?

Or just always skip ncols > 2, at least for describe? (might need to check whether other benchmarks are much slower)
Or do 5 columns instead of 2 add much additional information? (with 2 we already have a "not a single column" case) That should be sufficient to catch regressions in the describe algo in general, I think. A sketch of the skip approach follows.
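
A minimal sketch of the skip, relying on ASV's convention that raising NotImplementedError in setup() skips that parameter combination; the class is again a simplified stand-in rather than the actual pandas benchmark:

```python
import numpy as np
import pandas as pd


class Describe:
    params = [1, 2, 5, 10]
    param_names = ["ncols"]

    def setup(self, ncols):
        if ncols > 2:
            # 1 and 2 columns already cover the single- and multi-column code
            # paths; wider frames mostly add runtime, not extra signal.
            raise NotImplementedError
        self.df = pd.DataFrame(np.random.randn(10_000, ncols))

    def time_describe(self, ncols):
        self.df.describe()
```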

jbrockmendel (Member) commented
seems reasonable

jorisvandenbossche (Member) commented
@jbrockmendel the benchmark function taking the most overall time is now tslibs.period.TimeDT64ArrToPeriodArr.time_dt64arr_to_periodarr (after groupby.String.time_str_func, which was already mentioned above). The main contributor to this overall time is the huge parametrization (450 combinations for benchmarking a single function).
For example, the benchmark runs for 5 different sizes of the data ([0, 1, 100, 10 ** 4, 10 ** 6]). Is there any reason to think varying the size might provide useful information in this specific case? Could we also run this just for a single size?
Also, it is now run for several timezones and several freqs. Those are both useful parametrizations (they affect the timing), but we could also test them independently: all the timezones with one fixed freq, and all the freqs with one fixed tz, instead of the full product of combinations (see the sketch below).
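
A rough sketch of what the independent parametrizations could look like. The public DatetimeIndex.to_period() call stands in for the private dt64arr_to_periodarr routine that the real benchmark exercises, and the sizes, timezones, and freqs are only examples:

```python
import pandas as pd


class ToPeriodVaryFreq:
    # Vary only the target frequency, with a fixed tz-naive datetime index.
    params = ["D", "W", "M"]
    param_names = ["freq"]

    def setup(self, freq):
        self.dti = pd.date_range("2016-01-01", periods=100_000, freq="min")

    def time_to_period(self, freq):
        self.dti.to_period(freq)


class ToPeriodVaryTz:
    # Vary only the timezone, with a fixed target frequency.
    params = [None, "UTC", "US/Eastern"]
    param_names = ["tz"]

    def setup(self, tz):
        self.dti = pd.date_range("2016-01-01", periods=100_000, freq="min", tz=tz)

    def time_to_period(self, tz):
        self.dti.to_period("D")
```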

jbrockmendel (Member) commented
> Is there any reason to think varying the size might provide useful information in this specific case? Could we also run this just for a single size?

I think the runtime is expected to be affine (a constant per-call overhead plus a per-element cost), so in principle we'd want/need two sizes to catch regressions in either term.

More generally, the tslibs ASVs are written so that they can be skipped for PRs that don't touch tslibs (ditto for benchmarks/libs.py).
