Add CachedGISTEmbedLoss #2592
Conversation
Hello! Thanks a bunch for this! I've been testing this alongside MNRL, CMNRL and normal GIST yesterday and today. It seems to roughly match their performance, though I'm using some simple training & testing data.
Hi! Sorry, I just realized that calculating similarity for the entire batch in the guide, instead of using mini-batches, also adds extra memory usage. That's not ideal, since I can handle a batch size of up to 4096 with CMNRL, but only 1024 for this in my own test. I've made some adjustments to the guided part to address this.
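The mini-batch idea described above can be sketched as follows. This is a hedged, pure-Python stand-in for the actual torch implementation: the function name and the plain-list representation of embeddings are illustrative, but it shows the key point that only one row-chunk of the guide similarity matrix needs to exist at a time.

```python
# Sketch: computing a similarity matrix one row-chunk at a time, instead of
# materializing scores for the full batch at once. In the real loss this is
# done with torch tensors; here plain lists keep the example self-contained.

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def sim_matrix_chunked(queries, keys, chunk_size):
    """Compute the full queries-vs-keys similarity matrix in row chunks.

    Only `chunk_size` rows of scores are held "in flight" per step, which
    is what keeps peak memory low for large batch sizes.
    """
    rows = []
    for start in range(0, len(queries), chunk_size):
        chunk = queries[start:start + chunk_size]
        rows.extend([[dot(q, k) for k in keys] for q in chunk])
    return rows
```

The chunked result is identical to the all-at-once computation; only the peak memory differs.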
I think there's a small issue with 3208e61: the loss always seems to be 0. These are some of my logs (green is 3208e61, salmon is 5c054da).

Update: The evaluation performance does go up over time, so I suspect the loss is not actually 0, just VERY small (such that it rounds to 0.00 in my logs). That said, being so small likely results in underflow/inaccuracies, and the evaluation loss is notably worse than before. Additionally, the memory usage is actually a tad higher, but that might also somehow be related to the 0 loss.
Ah, my bad, I think it's because I did not properly offset the diagonal, so the guide mask always selected the diagonal starting from the first element. That easily causes -inf in the scores and can lead to a weird loss.
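The offset issue above can be illustrated with a small sketch. This is not the actual implementation; the function name and data layout are hypothetical. The point is that when scores are computed one mini-batch of rows at a time, the positive for row `i` of the mini-batch sits at column `offset + i` of the full batch, not at column `i`, so the GIST-style mask must use the shifted diagonal.

```python
# Sketch: GIST-style masking of likely false negatives for one mini-batch
# of rows, with the diagonal shifted by `offset` (the mini-batch's starting
# index within the full batch). Using offset 0 for every mini-batch was the
# bug: it compared against the wrong "positive" guide score.

def mask_false_negatives(scores, guide_scores, offset):
    """Set to -inf any candidate whose guide similarity exceeds the guide's
    similarity for the row's true positive (at column offset + i)."""
    masked = [row[:] for row in scores]
    for i, guide_row in enumerate(guide_scores):
        positive = guide_row[offset + i]  # shifted diagonal, not column i
        for j, g in enumerate(guide_row):
            if j != offset + i and g > positive:
                masked[i][j] = float("-inf")
    return masked
```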
After further experimentation, I can confirm that 3215c06 matches the performance (losses, evaluations) of 5c054da exactly, but the former allows for much higher batch sizes. E.g. I was able to set my batch size to an absurd 50k. Great job! I'll work on the documentation things that I had mentioned.
Good to know! Thank you for your effort! Happy to help :)
I think this is ready! I'll merge it now, so it can be included in tomorrow/Thursday's release. Do feel free to let me know if there were things that you think are missing/suboptimal. Thanks for your time/work on this!
As discussed in #2583, this is the implementation of the GradCache version of GISTEmbedLoss, reducing memory usage while maintaining performance comparable to GISTEmbedLoss.
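The GradCache trick that makes this memory reduction possible can be sketched with a toy scalar model. This is a hedged illustration, not the library's code: the model `e_i = w * x_i` and loss `L = mean(e_i**2)` are stand-ins, and the gradients are written by hand instead of using autograd. The idea is the two-pass structure: first embed everything without keeping computation graphs and cache only the per-embedding gradients, then re-embed one mini-batch at a time and chain the cached gradients back to the parameters.

```python
# Toy illustration of GradCache's two-pass gradient computation.
# Model: e_i = w * x_i (scalar parameter w); loss: L = mean(e_i ** 2).
# All gradients are derived by hand; names are illustrative.

def gradcache_grad(w, xs, mini_batch_size):
    n = len(xs)
    # Pass 1: embed the whole batch "without gradients" and cache dL/de_i.
    # Only O(n) floats are kept, not per-example computation graphs.
    embeddings = [w * x for x in xs]
    cached = [2.0 * e / n for e in embeddings]  # dL/de_i for L = mean(e^2)
    # Pass 2: re-embed one mini-batch at a time and accumulate dL/dw via
    # the chain rule, using the cached upstream gradients.
    grad_w = 0.0
    for start in range(0, n, mini_batch_size):
        for x, g in zip(xs[start:start + mini_batch_size],
                        cached[start:start + mini_batch_size]):
            grad_w += g * x  # de_i/dw = x_i
    return grad_w
```

Because the second pass only needs one mini-batch's activations in memory at a time, the effective batch size can grow far beyond what a single backward pass would allow, which matches the 50k batch size reported above.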