Add `SparseLinear_v2`, fixing indexing issues #754

danieldk · 2022-09-02T08:14:05Z

Introduce SparseLinear_v2 to fix indexing issues

SparseLinear does not correctly index the gradient/weight matrix (#752). This change fixes the indexing, so that the full matrix is used.

To retain compatibility with existing models that use SparseLinear, which works relatively well if there are not too many hash collisions, the fixed version is renamed to SparseLinear_v2.

While at it, fix another indexing-related bug:

The output of MurMur hashes was mapped to array indices as follows:

idx = hash & (nr_weight-1)

This works well when nr_weight is a power of two. For instance, if we have 16 buckets:

idx = hash & 15
idx = hash & 0b1111

However, when the user uses a bucket count that is not a power of two, this breaks down. For instance, if we have 15 buckets:

idx = hash & 14
idx = hash & 0b1110

This clears the lowest bit of every index, so the odd indices are never used. We fix this by using the modulus instead. To preserve compatibility with existing models, this change is only added to SparseLinear_v2.
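The difference between the two mappings can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `mask_index`, `mod_index`, and `reachable` are hypothetical, and consecutive integers stand in for real hash values. It is not the Cython code in `sparselinear.pyx`.

```python
def mask_index(hash_value: int, nr_weight: int) -> int:
    """Old-style mapping: bit-mask the hash with nr_weight - 1."""
    return hash_value & (nr_weight - 1)


def mod_index(hash_value: int, nr_weight: int) -> int:
    """Fixed mapping: take the hash modulo the number of buckets."""
    return hash_value % nr_weight


def reachable(index_fn, nr_weight: int, n_hashes: int = 1024) -> set:
    # Stand-in "hashes" 0..n_hashes-1; real hash output is not used here.
    return {index_fn(h, nr_weight) for h in range(n_hashes)}


mask_16 = reachable(mask_index, 16)  # power of two: every bucket is reachable
mask_15 = reachable(mask_index, 15)  # mask is 0b1110: only even buckets
mod_15 = reachable(mod_index, 15)    # modulus: every bucket is reachable

print(sorted(mask_15))
```

With 15 buckets, the masked mapping only ever reaches the eight even indices, while the modulus reaches all 15.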

Thanks to @sriram7797 for reporting this issue!

adrianeboyd · 2022-09-02T08:22:20Z

Do you really need to do the older version of the hash mapping at all?

danieldk · 2022-09-02T08:36:48Z

> Do you really need to do the older version of the hash mapping at all?

I don't know. If someone used

SparseLinear(length=not_a_power_of_2)

their models would break by changing from bit masking to modulo.

adrianeboyd · 2022-09-02T08:40:09Z

But with this as SparseLinear_v2 I'm not sure where that would happen?

danieldk · 2022-09-02T08:48:45Z

> But with this as SparseLinear_v2 I'm not sure where that would happen?

I am not sure I follow? There may be existing uses of SparseLinear/SparseLinear_v1 out there with lengths that are not a power of 2. So, we can't really remove SparseLinear without breaking the API. But we also cannot retroactively fix SparseLinear, because the indices would change (when a length that is not a power of two is used) and existing models would output garbage.
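To make the compatibility concern concrete, here is a small sketch of how the bucket chosen for a given hash changes when masking is swapped for the modulus. The names `v1_index` and `v2_index` and the sample hash values are made up for illustration; the sketch covers only the hash-to-bucket mapping, not the separate weight matrix indexing fix.

```python
def v1_index(hash_value: int, length: int) -> int:
    # Bit masking, as described for the original SparseLinear.
    return hash_value & (length - 1)


def v2_index(hash_value: int, length: int) -> int:
    # Modulus, as described for SparseLinear_v2.
    return hash_value % length


# Arbitrary non-negative integers standing in for hash values.
hashes = [7, 21, 100, 12345, 987654321]

# For a power-of-two length, both mappings agree on every hash.
same_for_16 = all(v1_index(h, 16) == v2_index(h, 16) for h in hashes)

# For length 15, the same hash can land in a different bucket, so weights
# trained under one mapping would be looked up in the wrong place.
moved_for_15 = [h for h in hashes if v1_index(h, 15) != v2_index(h, 15)]

print(same_for_16, moved_for_15)
```

A model trained with the masked mapping stores each feature's weights at the masked index, so switching the lookup to the modulus would read unrelated weights for any hash in `moved_for_15`.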

thinc/layers/sparselinear.pyx

adrianeboyd · 2022-09-02T09:05:54Z

Ah, I thought more was redefined than actually is. I think it would be clearer to call it something like v1_indexing.

danieldk · 2022-09-02T17:04:17Z

> Ah, I thought more was redefined than actually is. I think it would be clearer to call it something like v1_indexing.

Changed the name to v1_indexing.

svlandeg · 2022-09-06T13:42:47Z

You'll need to merge in the latest from master to avoid the CI failure with the macOS 10.15 environment.

danieldk · 2022-09-07T08:07:20Z

> You'll need to merge in the latest from master to avoid the CI failure with the macOS 10.15 environment.

Merged.

shadeMe · 2022-09-07T08:36:01Z

@explosion-bot please test_slow_gpu

explosion-bot · 2022-09-07T08:36:35Z

🪁 Successfully triggered build on Buildkite

URL: https://buildkite.com/explosion-ai/thinc-slow-gpu-tests/builds/27

adrianeboyd · 2022-10-24T08:48:41Z

Would the slow GPU tests pass after the changes Madeesh made for tensorflow?

shadeMe · 2022-10-24T09:33:41Z

The fix for the failing tests is in this commit. So, performing a new rebase on master ought to sort out the GPU test failures.

danieldk · 2022-11-14T14:02:15Z

@explosion-bot please test_slow_gpu

explosion-bot · 2022-11-14T14:02:46Z

🪁 Successfully triggered build on Buildkite

URL: https://buildkite.com/explosion-ai/thinc-slow-gpu-tests/builds/36

danieldk · 2022-11-14T16:08:36Z

@explosion-bot please test_slow_gpu

explosion-bot · 2022-11-14T16:08:58Z

🪁 Successfully triggered build on Buildkite

URL: https://buildkite.com/explosion-ai/thinc-slow-gpu-tests/builds/37

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

  A while ago, we fixed the `SparseLinear` layer to use all available parameters: explosion/thinc#754

  This change updates `TextCatBOW` to `v3`, which uses the new `SparseLinear_v2` layer. This results in a sizeable improvement on a text categorization task that was tested.

  While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent` option to make it possible to change the hidden size. Ideally, we'd just have an option called `length`. But the way that `TextCatBOW` uses hashes results in a non-uniform distribution of parameters when the length is not a power of two.

* Replace `TextCatBOW` `length_exponent` parameter by `length`

  We now round up the length to the next power of two if it isn't a power of two.

* Remove some tests for `TextCatBOW.v2`

* Fix missing import
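Rounding a length up to the next power of two, as the commit message above describes, could look roughly like the following. This is a minimal sketch: `next_power_of_two` is a hypothetical helper name and is not taken from the spaCy source.

```python
def next_power_of_two(length: int) -> int:
    """Return the smallest power of two that is >= length."""
    if length < 1:
        raise ValueError("length must be a positive integer")
    # (length - 1).bit_length() is the number of bits needed for length - 1,
    # so shifting 1 by that amount gives the next power of two (or length
    # itself when it already is one).
    return 1 << (length - 1).bit_length()


for requested in (1, 15, 16, 17, 262144):
    print(requested, "->", next_power_of_two(requested))
```

Lengths that are already a power of two are returned unchanged, so only other values are adjusted.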

danieldk added 2 commits September 2, 2022 10:11

danieldk added the `bug` (Bugs and behaviour differing from documentation) and `feat / layers` (Weights layers, transforms, combinators, wrappers) labels Sep 2, 2022

shadeMe reviewed Sep 2, 2022

thinc/layers/sparselinear.pyx (outdated, resolved)

danieldk added 4 commits September 2, 2022 18:53

Rename invalid_indexing to v1_indexing

7aba365

Add comment about v1 indexing

1da488e

Merge remote-tracking branch 'upstream/master' into sparselinear-inde…

90944d7

…xing-fix

Fix incorrect merge fix

1c813c4

Merge remote-tracking branch 'upstream/master' into sparselinear-inde…

e2176e0

…xing-fix

shadeMe approved these changes Sep 7, 2022

svlandeg reviewed Sep 7, 2022

thinc/tests/layers/test_layers_api.py (resolved)

svlandeg linked an issue Sep 7, 2022 that may be closed by this pull request

SparseLinear does not look up weights correctly #752

Closed

svlandeg reviewed Sep 7, 2022

website/docs/api-layers.md (outdated, resolved)

danieldk added 3 commits November 14, 2022 11:37

Merge remote-tracking branch 'upstream/master' into sparselinear-inde…

54b075d

…xing-fix

Add the new tag to the docs

682a1c0

Check that the corrected hash function has the expected distribution

4892c42

Symbol export fixes

fa19f12

svlandeg merged commit 2310d4e into explosion:master Nov 17, 2022

danieldk mentioned this pull request Nov 23, 2023

Update TextCatBOW to use the fixed SparseLinear layer explosion/spaCy#13149

Merged

3 tasks
