
Indexing benchmarking #1851

Merged (7 commits) on Jan 24, 2018
Conversation

fujiisoup (Member)

Just added some benchmarks for basic, outer, and vectorized indexing and assignments.

@@ -34,6 +34,9 @@ nosetests.xml
.cache
.ropeproject/

# asv environments
.asv
fujiisoup (Member, Author)

Is there any problem with adding .asv to .gitignore?
The large number of files in this directory makes my GUI git client freeze.

@@ -29,3 +29,12 @@ def randn(shape, frac_nan=None, chunks=None):
    x.flat[inds] = np.nan

    return x


def randint(low, high=None, size=None, frac_minus=None):
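The diff only shows the new helper's signature; its body is elided here. By analogy with how `randn()` injects NaNs via `frac_nan` just above, a plausible (hypothetical, not taken from the PR) implementation might look like:

```python
import numpy as np

def randint(low, high=None, size=None, frac_minus=None):
    # Hypothetical body: draw random integers, then overwrite a fraction
    # `frac_minus` of the entries with -1, mirroring how randn() uses
    # `frac_nan` to inject NaNs.
    x = np.random.randint(low, high, size)
    if frac_minus is not None:
        inds = np.random.choice(x.size, int(x.size * frac_minus), replace=False)
        x.flat[inds] = -1
    return x
```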
Member

Can we use a seed for all these random numbers? That should decrease the variance for these test results.

fujiisoup (Member, Author)

See line 9.

jhamman was careful enough :)

Member

A global seed for all benchmarks isn't a great idea -- it means that results will vary depending upon whether we run the full benchmark suite or not, and depending upon the order in which benchmark tests are run. It would be better to set a seed for each test.
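To make the per-test suggestion concrete: asv runs a benchmark class's `setup()` before each benchmark, so seeding there keeps results reproducible regardless of suite composition or test order. A minimal hypothetical sketch (class and sizes are illustrative, not from the PR):

```python
import numpy as np

class Indexing:
    def setup(self):
        # Per-test seed set in setup(), which asv calls before each
        # benchmark -- unlike a module-level seed, results no longer
        # depend on which other benchmarks ran first.
        np.random.seed(123)
        self.x = np.random.randn(1000, 500)

    def time_basic_indexing(self):
        self.x[::5, 100:200]
```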

fujiisoup (Member, Author)

> it means that results will vary depending upon whether we run the full benchmark suite or not, and depending upon the order in which benchmark tests are run.

Thanks for the details.
Understood.
Done.



def time_indexing_basic():
    for ind in basic_indexes:
Member

Rather than a loop, another option would be to separate these cases into separate benchmarks. That would give more granular information.
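One way to get that granularity without writing many near-identical functions is asv's parameterization: each value in `params` becomes its own timed case. A hypothetical sketch (the index cases here are illustrative, not the PR's actual `basic_indexes`):

```python
import numpy as np

class IndexingCases:
    # Each entry in `params` is benchmarked separately, so asv reports
    # one timing per indexing kind instead of one aggregate over a loop.
    params = ['basic', 'outer']
    param_names = ['kind']

    def setup(self, kind):
        self.x = np.arange(10000).reshape(100, 100)
        self.indexers = {
            'basic': (slice(0, 50), slice(None, None, 2)),
            'outer': (np.arange(0, 100, 7), slice(None)),
        }

    def time_indexing(self, kind):
        self.x[self.indexers[kind]]
```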



try:
    ds_dask = ds.chunk({'x': 100, 'y': 50, 't': 50})
Member

Another option is to use a subclass, like I did in #1847
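The subclassing pattern works because asv re-runs every inherited `time_*` benchmark against whatever `setup()` provides; a subclass only has to override `setup()`. A hypothetical numpy-only sketch of the pattern (in the real PR the subclass would build a dask-backed dataset with `ds.chunk(...)` instead):

```python
import numpy as np

class IndexingBase:
    def setup(self):
        self.x = np.random.randn(200, 100)

    def time_basic_indexing(self):
        self.x[::3, 10:50]

class IndexingWithNaN(IndexingBase):
    # Overriding only setup() re-runs every time_* benchmark from the
    # base class against the modified data -- no benchmark bodies are
    # duplicated.
    def setup(self):
        super().setup()
        self.x[::10] = np.nan
```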


def time_indexing_basic_dask():
    for ind in basic_indexes:
        ds_dask.isel(**ind)
Member

It's probably a good idea to call .load() on these to load them into numpy. Otherwise it only measures the time to construct the dask graph, not the time to evaluate it. Usually evaluation time dominates.
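To illustrate why the `.load()` matters without depending on dask here, a tiny stand-in for a lazy array shows the distinction being timed (in the actual benchmark this would be `ds_dask.isel(**ind).load()`):

```python
import numpy as np

class LazyArray:
    # Minimal stand-in for a dask-backed array: indexing only records
    # deferred work (like building a dask graph); load() evaluates it,
    # which is the part that usually dominates the runtime.
    def __init__(self, compute):
        self._compute = compute

    def isel(self, key):
        return LazyArray(lambda: self._compute()[key])

    def load(self):
        return self._compute()

lazy = LazyArray(lambda: np.arange(10000).reshape(100, 100))

def time_indexing_lazy():
    # Without the .load(), this would only time "graph" construction.
    lazy.isel(slice(0, 50)).load()
```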

fujiisoup (Member, Author)
Thanks. All done.

fujiisoup (Member, Author)

This might belong in a separate issue, but I think we should have a clear link to our benchmark results page
(maybe in our docs, the README, or the PR template?).

jhamman (Member) commented Jan 24, 2018

@fujiisoup - I don't see any problem with including a link in the docs (probably in the testing section) and/or in the README. It's maybe too soon to include it in the PR template, since we just don't have very good coverage yet.

jhamman (Member) left a comment

@fujiisoup - glad to see the ASV setup getting used. Everything here looks good to me.

@fujiisoup fujiisoup merged commit 04974b9 into pydata:master Jan 24, 2018