Improve dot performance #389

Merged: 80 commits merged into pydata:master on Aug 9, 2020

Conversation

@daletovar (Contributor)

  • Improves dot for two COO arrays; addresses "Performance benchmarks comparison with scipy.sparse" #331.

    • I used the same algorithm that scipy uses (see the sketch after this list).
    • It's a bit slower than scipy because we have to do some sorting to ensure sorted coordinates.
  • Adds dot for GCXS arrays.

    • Uses the same algorithm.
    • Works for csr @ csr and csc @ csc.
    • I found that the csr @ csc dot-product-based algorithm was just too slow, so we convert one of the arrays.
  • Adds io for GCXS arrays.
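
For reference, here is a simplified pure-Python sketch of the general CSR @ CSR approach described above. The names are illustrative and the real kernel in this PR is jit-compiled and structured differently; the sketch also shows why the output column indices come out unsorted:

import numpy as np

def csr_matmul_sketch(a_indptr, a_indices, a_data,
                      b_indptr, b_indices, b_data, n_cols):
    # Simplified illustration only -- not the PR's actual kernel.
    out_indptr = [0]
    out_indices = []
    out_data = []
    sums = np.zeros(n_cols)      # dense accumulator for one output row
    mask = np.full(n_cols, -1)   # last output row that touched each column
    n_rows = len(a_indptr) - 1

    for i in range(n_rows):
        row_cols = []
        for jj in range(a_indptr[i], a_indptr[i + 1]):
            j, v = a_indices[jj], a_data[jj]
            for kk in range(b_indptr[j], b_indptr[j + 1]):
                k = b_indices[kk]
                sums[k] += v * b_data[kk]
                if mask[k] != i:          # first hit on column k in row i
                    mask[k] = i
                    row_cols.append(k)
        # row_cols is in insertion order, i.e. generally unsorted,
        # which is why the result needs a sorting pass afterwards
        for k in row_cols:
            out_indices.append(k)
            out_data.append(sums[k])
            sums[k] = 0.0
        out_indptr.append(len(out_indices))

    return np.array(out_indptr), np.array(out_indices), np.array(out_data)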

@hameerabbasi (Collaborator)

LGTM caught some valid concerns; could you address those?

@hameerabbasi (Collaborator)

And the code coverage is a bit lacking. 😕

@daletovar (Contributor, Author)

Thanks @rgommers, I'll make sure to do that going forward.

@daletovar (Contributor, Author)

With regards to the sorting, the issue is that the dot algorithm produces results that are not sorted:

In [1]: import scipy.sparse as ss
In [2]: x = ss.random(1000,1000,density=.2,format='csr')
In [3]: y = ss.random(1000,1000,density=.2,format='csr')
In [4]: res = x @ y
In [5]: res.indptr[1]
Out[5]: 1000
In [6]: res.indices
Out[6]: array([554, 328, 723, ...,  14,   7,   6], dtype=int32)

The indices for each row of the resulting matrix are out-of-order. So I added something like this:

# restore sorted column indices within each row of the result
for start, stop in zip(res.indptr[:-1], res.indptr[1:]):
    order = np.argsort(res.indices[start:stop])
    res.data[start:stop] = res.data[start:stop][order]
    res.indices[start:stop] = res.indices[start:stop][order]

@daletovar (Contributor, Author)

@hameerabbasi, yes I'll see if I can fix those.

@lgtm-com (bot) commented Aug 5, 2020

This pull request introduces 2 alerts and fixes 1 when merging 5b4f08a into e682a8b - view on LGTM.com

new alerts:

  • 1 for Unused local variable
  • 1 for Unused import

fixed alerts:

  • 1 for Unused import

@lgtm-com (bot) commented Aug 5, 2020

This pull request introduces 1 alert and fixes 1 when merging eb8c0e5 into e682a8b - view on LGTM.com

new alerts:

  • 1 for Unused import

fixed alerts:

  • 1 for Unused import

@lgtm-com (bot) commented Aug 5, 2020

This pull request introduces 1 alert and fixes 1 when merging ae63040 into e682a8b - view on LGTM.com

new alerts:

  • 1 for Unused import

fixed alerts:

  • 1 for Unused import

@daletovar (Contributor, Author)

Here's an initial benchmark:

In [2]: x = sparse.random((5000, 5000), density=0.001)
In [3]: y = sparse.random((5000, 5000), density=0.001)
In [4]: xs = x.tocsr()
In [5]: ys = y.tocsr()
In [6]: x @ y # jit
Out[6]: <COO: shape=(5000, 5000), dtype=float64, nnz=124139, fill_value=0.0>
In [7]: %timeit x @ y
10.5 ms ± 104 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [8]: %timeit xs @ ys
1.14 ms ± 2.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Comment on lines 642 to 647
# ensure sorted indices
order = np.argsort(indices[indptr[i] : indptr[i + 1]])
data[indptr[i] : indptr[i + 1]] = data[indptr[i] : indptr[i + 1]][order]
indices[indptr[i] : indptr[i + 1]] = indices[indptr[i] : indptr[i + 1]][order]
Collaborator

Can we possibly sort before putting in the data? I suspect it'll be a lot faster.

Contributor Author

I'm not quite sure what you mean. Could you give an example?

Collaborator

Well, the original CSR-CSR multiplication algorithm doesn't require sorting, correct? I suspect this one only requires it because you're passing in unsorted data in the first place (or iterating over it in a way that's unsorted).

Could you possibly pre-sort the data so the output here will be sorted?

Contributor Author

The original CSR-CSR multiplication algorithm iterates over the data in a way that's unsorted. In scipy, the resulting csr_matrix has indices that are out-of-order:

In [13]: xs.has_sorted_indices
Out[13]: 1
In [14]: res = xs @ ys
In [15]: res.has_sorted_indices
Out[15]: 0

Since many of the operations for COO and GCXS rely on maintaining sorted coordinates, I was thinking it would be best to ensure that the result of dot has sorted coords.

Collaborator

In that case could you post a benchmark which includes sorting for the SciPy version?

Contributor Author

%%timeit
x @ y # coo coo
10.2 ms ± 51.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%%timeit
res = xs @ ys
res.sort_indices()
4.05 ms ± 8.83 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

I'm going to test doing just one sort at the end to see how that does.
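
Roughly, "one sort at the end" could look like the following (a hypothetical helper for illustration, not this PR's actual code): expand indptr to a per-entry row id and restore sorted column order for all rows with a single lexsort instead of one argsort per row.

import numpy as np

def sort_result_once(indptr, indices, data):
    # illustration only -- not the PR's actual implementation
    rows = np.repeat(np.arange(len(indptr) - 1), np.diff(indptr))
    order = np.lexsort((indices, rows))  # primary key: row, secondary: column
    return indices[order], data[order]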

Contributor Author

%%timeit
x @ y # coo coo - one sort at the end
7.55 ms ± 96.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%%timeit
res = xs @ ys
res.sort_indices()
4.04 ms ± 15.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

It looks like doing one sort at the end is better.

Collaborator

Could you also post some GCXS-GCXS benchmarks for direct comparison with SciPy?

Contributor Author

%%timeit
x @ y # csr csr
6.32 ms ± 107 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

x = x.change_compressed_axes([1])
y = y.change_compressed_axes([1])
x @ y

%%timeit
x @ y # csc csc
6.11 ms ± 46.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

x = x.tocoo()
y = y.tocoo()
x @ y

%%timeit
x @ y # coo coo
7.23 ms ± 93.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%%timeit
res = xs @ ys
res.sort_indices()
4.08 ms ± 49.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

assert_eq(sparse.dot(sa, b), sparse.dot(a, sb))
assert_eq(np.dot(a, b), sparse.dot(sa, sb))

if hasattr(operator, "matmul"):
Collaborator

This was a compat check for Python 2. It can be removed.

def test_save_load_npz_file(compression):
    x = sparse.random((2, 3, 4, 5), density=0.25)
@pytest.mark.parametrize("format", ["coo", "gcxs"])
def test_save_load_npz_file(compression, format):
Collaborator

Same for the next test.

@hameerabbasi (Collaborator) left a comment

A few changes. Looks pretty awesome overall!

@daletovar (Contributor, Author)

Thanks so much for the review, @hameerabbasi. I think those were really helpful changes. Is there anything else you'd like to see?

Comment on lines 200 to 207
if hasattr(operator, "matmul"):
    # Basic equivalences
    assert_eq(operator.matmul(a, b), operator.matmul(sa, sb))
    # Test that SOO's and np.array's combine correctly
    # Not possible due to https://github.com/numpy/numpy/issues/9028
    # assert_eq(eval("a @ sb"), eval("sa @ b"))


Collaborator

This should be added back, only the "if" was unnecessary.

@hameerabbasi (Collaborator) commented Aug 8, 2020

I have one final question: Does dot now return GCXS by default for COO/ndarray inputs? If so, we should change this.

And one possibility for future work: Writing a sparse-dense matrix multiplication kernel separately.
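
A rough sketch of what a dedicated CSR-times-dense kernel could look like (illustrative only; the name, arguments, and pure-Python loop are assumptions, and a real kernel would presumably be jit-compiled):

import numpy as np

def csr_times_dense_sketch(indptr, indices, data, dense):
    # each nonzero A[i, j] scales row j of the dense operand into output row i
    n_rows = len(indptr) - 1
    out = np.zeros((n_rows, dense.shape[1]), dtype=dense.dtype)
    for i in range(n_rows):
        for jj in range(indptr[i], indptr[i + 1]):
            out[i] += data[jj] * dense[indices[jj]]
    return out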

@daletovar (Contributor, Author)

No, the default behavior is the same. I did make it so that GCXS @ COO results in a GCXS though.

> And one possibility for future work: Writing a sparse-dense matrix multiplication kernel separately.

+1

I'm away from my computer for the next couple of days, so I apologize if I don't get back right away about any more suggestions.

@hameerabbasi merged commit 3a6d955 into pydata:master on Aug 9, 2020
@hameerabbasi (Collaborator)

Thanks, @daletovar!
