IA3 adaptors #403
Conversation
Only minor comments.
:param config:
    A :class:`~tango.integrations.transformers.ia3.WithIA3Config` that specifies the layers to modify.
Is there any chance we could automatically detect the right config, at least in some cases?
So we could make it look up the known configs in MODEL_NAME_TO_CONFIG in tango/integrations/transformers/ia3.py.
But if you mean trying to figure out a config from scratch just by looking at the model architecture, that might be pretty difficult. Even among the few models we support right now, they all use very different names for the layers we need to modify. And there's enough variation in how the modules can be nested that we can't find the layers we need just by looking for a Linear layer at a certain position in the model graph.
No, I mean: given the model, can we just find the model's name and do the lookup that way?
I checked: we can look up transformer_model.config.name_or_path. If it matches anything in the dictionary of configs, we can use it automatically.
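A rough sketch of the lookup being discussed, assuming MODEL_NAME_TO_CONFIG maps Hugging Face model names to WithIA3Config instances as described in this thread; the helper name infer_ia3_config is hypothetical:

from tango.integrations.transformers.ia3 import MODEL_NAME_TO_CONFIG  # assumed location, per this thread

def infer_ia3_config(transformer_model):
    # Hypothetical helper: look up a default IA3 config by the model's name.
    model_name = transformer_model.config.name_or_path
    if model_name in MODEL_NAME_TO_CONFIG:
        return MODEL_NAME_TO_CONFIG[model_name]
    raise ValueError(
        f"No default IA3 config known for {model_name!r}; please pass a config explicitly."
    )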
Yup working on a commit to do that! c1e27f1
Okay it's working now and I also updated the Catwalk end to use this: allenai/catwalk@9800e12
input_seq = tokenizer(["A tiny test on a tiny model."], return_tensors="pt")

model = AutoModelForCausalLM.from_pretrained(model_name)
I'm surprised this test works, since you're not setting the models into eval() mode.
Oops good catch!
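A minimal sketch of the fix being discussed, reusing the names from the quoted test snippet; the tiny model name here is an assumption, not necessarily what the PR's test actually uses:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sshleifer/tiny-gpt2"  # assumed placeholder for the test's tiny model
tokenizer = AutoTokenizer.from_pretrained(model_name)
input_seq = tokenizer(["A tiny test on a tiny model."], return_tensors="pt")

model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()  # disable dropout so repeated forward passes give deterministic outputs

with torch.no_grad():
    output = model(**input_seq)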
Co-authored-by: Dirk Groeneveld <dirkg@allenai.org>
Changes proposed in this pull request:
Results on piqa
A related PR in catwalk implements an example of how these adaptors can be trained. While the results are hardly impressive, the IA3 implementation manages to reduce validation loss and recover much of the accuracy of the fully tuned equivalent for all of the architectures for which default configurations are provided. The gpt-j-6b full tune is not able to run on a single GPU, while the IA3 training fits because it has far fewer trainable parameters and therefore far fewer optimizer states.
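To illustrate the memory argument above, here is a hedged sketch (not the PR's actual training code) of handing only the adaptor parameters to the optimizer, so optimizer states are only allocated for the small set of trainable IA3 weights; identifying adaptor parameters by an "ia3" substring in their names is an assumption made for this example:

import torch

# Assumes `model` has already been modified with IA3 adaptors and that the
# adaptor parameters have "ia3" in their names -- both assumptions for illustration.
for name, param in model.named_parameters():
    param.requires_grad = "ia3" in name  # freeze everything except the adaptors

trainable = [p for p in model.parameters() if p.requires_grad]
print(f"trainable parameters: {sum(p.numel() for p in trainable):,}")

# AdamW keeps two moment tensors per trainable parameter, so restricting the
# optimizer to the adaptor weights keeps its state small.
optimizer = torch.optim.AdamW(trainable, lr=3e-4)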