Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SparseLinear_v2, fixing indexing issues #754

Merged
merged 11 commits into from
Nov 17, 2022

Conversation

danieldk
Copy link
Contributor

@danieldk danieldk commented Sep 2, 2022

Introduce SparseLinear_v2 to fix indexing issues SparseLinear does not correctly index the gradient/weight matrix (#752). This change fixes the indexing, so that the full matrix is used.

To retain compatibility with existing models that use SparseLinear, which works relatively well if there are not too many hash collisions, the fixed version is renamed to SparseLinear_v2.


While at it, fix another indexing-related bug:

The output of MurMur hashes were mapped to array indices as follows:

idx = hash & (nr_weight-1)

This works well when nr_weight is a power of two. For instance,
if we have 16 buckets:

idx = hash & 15
idx = hash & 0b1111

However, when the user uses a bucket count that is not a power of two, this breaks down. For instance, if we have 15 buckets:

idx = hash & 14
idx = hash & 0b1110

This would mask out all odd indices. We fix this by using the modulus instead. To preserve compatibility with existing models, this change is only added to SparseLinear_v2.

`SparseLinear` does not correctly index the gradient/weight matrix
(explosion#752). This change fixes the indexing, so that the full matrix is
used.

To retain compatibility with existing models that use `SparseLinear`, which
works relatively well if there are not too many hash collisions, the fixed
version is renamed to `SparseLinear_v2`.

Thanks to @sriram7797 for reporting this issue!
The output of MurMur hashes were mapped to array indices as follows:

```
idx = hash & (nr_weight-1)
```

This works well when `nr_weight` is a power of two. For instance,
if we have 16 buckets:

```
idx = hash & 15
idx = hash & 0b1111
```

However, when the user uses a bucket count that is not a power of
two, this breaks down. For instance, if we have 15 buckets:

```
idx = hash & 14
idx = hash & 0b1110
```

This would mask out all odd indices. We fix this by using the modulus
instead. To preserve compatibility with existing models, this change
is only added to `SparseLinear_v2`.
@danieldk danieldk added bug Bugs and behaviour differing from documentation feat / layers Weights layers, transforms, combinators, wrappers labels Sep 2, 2022
@adrianeboyd
Copy link
Contributor

Do you really need to do the older version of the hash mapping at all?

@danieldk
Copy link
Contributor Author

danieldk commented Sep 2, 2022

Do you really need to do the older version of the hash mapping at all?

I don't know. If someone used

SparseLinear(length=not_a_power_of_2)

their models would break by changing from bit masking to modulo.

@adrianeboyd
Copy link
Contributor

But with this as SparseLinear_v2 I'm not sure where that would happen?

@danieldk
Copy link
Contributor Author

danieldk commented Sep 2, 2022

But with this as SparseLinear_v2 I'm not sure where that would happen?

I am not sure I follow? There may be existing uses for SparseLinear/SparseLinear_v1 out there with lengths that are not a power of 2? So, we can't really remove SparseLinear without breaking the API. But we can also not retroactively fix SparseLinear, because the indices would change (when a lengths that is not a power of two is used) and existing models would output garbage.

@adrianeboyd
Copy link
Contributor

Ah, I thought more was redefined than actually is. I think it would be clearer to call it something like v1_indexing.

@danieldk
Copy link
Contributor Author

danieldk commented Sep 2, 2022

Ah, I thought more was redefined than actually is. I think it would be clearer to call it something like v1_indexing.

Changed the name to v1_indexing.

@svlandeg
Copy link
Member

svlandeg commented Sep 6, 2022

You'll need to merge in the latest from master to avoid the CI failure with the macOS 10.15 environment.

@danieldk
Copy link
Contributor Author

danieldk commented Sep 7, 2022

You'll need to merge in the latest from master to avoid the CI failure with the macOS 10.15 environment.

Merged.

@shadeMe
Copy link
Collaborator

shadeMe commented Sep 7, 2022

@explosion-bot please test_slow_gpu

@explosion-bot
Copy link
Collaborator

explosion-bot commented Sep 7, 2022

🪁 Successfully triggered build on Buildkite

URL: https://buildkite.com/explosion-ai/thinc-slow-gpu-tests/builds/27

@svlandeg svlandeg linked an issue Sep 7, 2022 that may be closed by this pull request
website/docs/api-layers.md Outdated Show resolved Hide resolved
@adrianeboyd
Copy link
Contributor

Would the slow GPU tests pass after the changes Madeesh made for tensorflow?

@shadeMe
Copy link
Collaborator

shadeMe commented Oct 24, 2022

The fix for the failing tests is in this commit. So, performing a new rebase on master ought to sort out the GPU test failures.

@danieldk
Copy link
Contributor Author

@explosion-bot please test_slow_gpu

@explosion-bot
Copy link
Collaborator

explosion-bot commented Nov 14, 2022

🪁 Successfully triggered build on Buildkite

URL: https://buildkite.com/explosion-ai/thinc-slow-gpu-tests/builds/36

@danieldk
Copy link
Contributor Author

@explosion-bot please test_slow_gpu

@explosion-bot
Copy link
Collaborator

explosion-bot commented Nov 14, 2022

🪁 Successfully triggered build on Buildkite

URL: https://buildkite.com/explosion-ai/thinc-slow-gpu-tests/builds/37

@svlandeg svlandeg merged commit 2310d4e into explosion:master Nov 17, 2022
danieldk added a commit to danieldk/spaCy that referenced this pull request Nov 23, 2023
A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.
svlandeg pushed a commit to explosion/spaCy that referenced this pull request Nov 29, 2023
* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
jordankanter pushed a commit to jordankanter/spaCy that referenced this pull request Mar 14, 2024
…13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
jordankanter pushed a commit to jordankanter/spaCy that referenced this pull request Mar 29, 2024
…13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
jordankanter pushed a commit to jordankanter/spaCy that referenced this pull request Apr 17, 2024
…13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
jordankanter pushed a commit to jordankanter/spaCy that referenced this pull request Apr 17, 2024
…13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
jordankanter pushed a commit to jordankanter/spaCy that referenced this pull request May 10, 2024
…13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
jordankanter pushed a commit to jordankanter/spaCy that referenced this pull request May 10, 2024
…13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
jordankanter pushed a commit to jordankanter/spaCy that referenced this pull request May 10, 2024
…13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
jordankanter pushed a commit to jordankanter/spaCy that referenced this pull request May 10, 2024
…13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
jordankanter pushed a commit to jordankanter/spaCy that referenced this pull request May 21, 2024
…13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
jordankanter pushed a commit to jordankanter/spaCy that referenced this pull request May 21, 2024
…13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
jordankanter pushed a commit to jordankanter/spaCy that referenced this pull request May 21, 2024
…13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
jordankanter pushed a commit to jordankanter/spaCy that referenced this pull request May 21, 2024
…13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
jordankanter pushed a commit to jordankanter/spaCy that referenced this pull request May 21, 2024
…13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
jordankanter pushed a commit to jordankanter/spaCy that referenced this pull request May 21, 2024
…13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
jordankanter pushed a commit to jordankanter/spaCy that referenced this pull request May 21, 2024
…13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
jordankanter pushed a commit to jordankanter/spaCy that referenced this pull request May 21, 2024
…13149)

* Update `TextCatBOW` to use the fixed `SparseLinear` layer

A while ago, we fixed the `SparseLinear` layer to use all available
parameters: explosion/thinc#754

This change updates `TextCatBOW` to `v3` which uses the new
`SparseLinear_v2` layer. This results in a sizeable improvement on a
text categorization task that was tested.

While at it, this `spacy.TextCatBOW.v3` also adds the `length_exponent`
option to make it possible to change the hidden size. Ideally, we'd just
have an option called `length`. But the way that `TextCatBOW` uses
hashes results in a non-uniform distribution of parameters when the
length is not a power of two.

* Replace TexCatBOW `length_exponent` parameter by `length`

We now round up the length to the next power of two if it isn't
a power of two.

* Remove some tests for TextCatBOW.v2

* Fix missing import
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bugs and behaviour differing from documentation feat / layers Weights layers, transforms, combinators, wrappers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SparseLinear does not look up weights correctly
5 participants